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1  Introduction 

To  extend  artifact  and/or  draft  functionality  in  Ei-L,  you  must  first  have  a  working  knowledge 
of  how  E-L  is  used.  E-L  is  extended  from  within,  by  using  the  system  to  develop  new 
functionality  and  then  to  augment  the  plexes  that  describe  the  structure  of  the  artifacts 
system. 

This  guide  assumes  you  are  familiar  with  the  E-L  Users  ’  Manual.  To  add  a  tool  to  the 
system,  you  will  also  need  to  know  how  to  develop  C  or  Common  Lisp  programs  using  E-L, 
so  you  should  also  have  read  either  the  C  and  Unix  Artifacts  Manual  or  the  Common  Lisp 
Artifacts  Manual. 

Artifact  and  draft  behavior.  There  are  several  ways  to  extend  artifact  and  draft  func¬ 
tionality. 

•  Add  a  new  artifact  type. 

•  Add  a  new  derivative  class. 

•  Add  a  new  tool,  a  deriver,  for  computing  a  derivative  (or  set  of  derivatives)  from  an 
artifact. 

•  Add  a  new  tool,  a  non-deriver  that  uses  derivatives,  but  does  not  directly  compute 
them  (e.g.,  print  or  export  to  the  file  system). 

The  first  two  are  quite  independent  avenues  of  extension;  the  third  is  usually  an  adjunct 
of  one  or  both  of  the  others,  but  not  necessarily;  the  fourth  is  usually  an  adjunct  of  the 
second.  A  new  artifact  type  gives  users  a  new  way  to  record  primary  information  (such  eis  a 
source  program  or  a  manuscript  markup)  for  consumption  by  tools.  A  new  derivative  class 
gives  tools  a  new  way  to  record  derived  information  for  exchange  with  other  tools  and  for 
consumption  by  users.  A  good  extender  will  focus  on  reusability  of  artifact  representations 
and  of  derivative  classes.  Tools  whose  inputs  are  derivatives  from  a  broadly  applicable  class 
can  be  used  without  modification  as  new  types  are  added. 

The  aspects  of  extension  enumerated  above  are  the  subjects  of  the  remaining  chapters  of 
this  guide.  The  rest  of  this  introduction  is  meant  to  motivate  those  chapters  for  first-time 
extenders. 

Adding  a  type.  Different  types  of  artifacts  are  structured  differently  and  they  present 
themselves  quite  differently  in  the  user  interface.  These  differences  carry  over  into  the 
persistent  representations  of  artifacts  and  drafts  as  well.  The  variation  is  achieved  by  a  set  of 
extension  facilities  in  the  Epoch-based  user  interface  of  the  system,  which  is  implemented  in 
Em  ACS  Lisp.  These  facilities  are  structured  to  make  common  extensions  ecisy  with  little  or  no 
programming,  while  allowing  nitty-gritty  control  when  it’s  needed.  There  is,  for  example,  a 
connection  to  a  template-driven  utility  for  representation  translation  that  simplifies  handling 
of  structured  artifact  headers  and  other  such  forms.  Among  other  advantages,  this  makes 
it  easy  to  do  some  syntax  checking  before  a  draft  is  committed,  so  that  drafts  without  the 
right  gross  structure  don’t  even  become  artifacts. 
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Adding  a  derivative  class.  Technically,  a  derivative  is  a  string  associated  with  an  artifact; 
it  contains  or  it  points  to  information  of  a  particular  kind  about  the  artifact.  Often,  a 
derivative  includes  the  name  of  a  file,  relative  to  a  repository  subdirectory  that  holds  derived 
information  of  that  kind.  For  example,  c-module  artifacts  have  derivatives  of  kind  c-f  ile. 
A  typical  such  derivative  is  the  string  142 .  c,  which  names  a  file  in  the  repository  subdirectory 
for  c-file’s.  That  file  contains  C  code  that  has  been  extracted  from  the  module  and  its 
references  and  put  into  a  form  suitable  for  compilation. 

Derivative  kinds  fall  into  categories,  called  derivative  classes.  The  members  of  a  class 
share  the  same  methods  for  operations  like  display  (via  the  exeunine-derivative  command) 
and  deletion  (via  delete-derivative,  for  instance).  An  example  of  a  derivative  class  is 
object,  the  class  of  kinds  covering  compiled  C  object  files.  The  kinds  in  cleiss  object 
have  names  like  objectXsunS  and  objectXmips-prof  ile  because  different  target  machine 
architectures  and  different  compilation  options  produce  different  kinds  of  object  files.  Every 
derivative  kind  must  have  a  class,  even  if  it’s  a  singleton  whose  name  is  the  same  cis  the  kind 
it  contains.  (The  kind  c-f  ile  is  the  lone  member  of  the  class  c-f  ile.)  You  add  a  class  by 
registering  it  in  the  derivative-classes  plex,  described  in  chapter  3. 

Tools  and  Workers.  This  document  assumes  that  as  a  user  of  the  current  E-L  imple¬ 
mentation,  you  have  acquired  a  working  understanding  of  how  background  processing  is 
managed.  You  know,  for  example,  that  background  tasks  are  called  jobs.  You  know  about 
the  scheduling  plexes  called  past,  present,  and  future,  which  describe  jobs  already  com¬ 
pleted,  jobs  currently  in  progress,  and  jobs  waiting  to  be  started.  You  are  at  least  vaguely 
aware  that  the  present  plex  identifies  processes  called  job  servers  that  appear  to  take  jobs 
(albeit  selectively)  from  the  future  plex,  work  on  them  for  a  while,  and  move  them  into  the 
past  plex. 

A  job  is  based  on  a  tool,  which  names  the  function  that  is  run  in  the  background.  Ex¬ 
tensions  to  the  user  interface,  by  contrast,  are  called  commands.  Tools  are  divided  into  two 
categories,  those  that  compute  a  derivative  (or  set  of  derivatives)  from  an  artifact,  called  de¬ 
rivers,  and  non-derivers.  E-L  provides  tool  writers  with  a  simple  interface  that  allows  them 
to  ignore  the  problems  of  scheduling  jobs,  locking  and  reading  plexes,  and  so  on.  Although 
a  tool  is  activated  by  a  job  server  process,  it’s  not  actually  linked  into  the  job  server  exe¬ 
cutable.  Tools  are  collected  in  executable  files  called  workers.  You  may  either  create  a  new 
worker  for  a  new  tool  that  you  write,  or  plug  your  tool  into  an  existing  worker.  Problems 
like  communicating  with  the  job  server,  handling  internal  errors,  and  managing  heap  storage 
are  taken  care  of  by  a  nucleus  common  to  all  workers  in  a  given  implementation  language. 
A  worker  process  is  normally  spawned  as  an  inferior  process  by  a  job  server  the  first  time  it 
needs  to  call  one  of  the  worker’s  tools.  But  to  debug  a  tool,  you  can  reverse  the  situation 
by  starting  a  worker  process  and  then  giving  it  its  own  job  server  to  act  as  interface  to  the 
artifacts  system. 

Workers  and  derivers  are  registered  in  plexes  called  workers  and  derivers.  These  are 
used  by  the  scheduling  algorithm  to  determine  the  derivers  that  can  carry  out  a  given  job, 
to  identify  workers  that  contain  those  tools,  and  then  to  select  an  available  job  server  that 
can  run  one  of  those  workers.  Non-derivers  will  be  discussed  later. 
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Adding  a  new  deriver.  In  the  ultimate  E-L  system,  the  developer  of  a  program  or  a 
document  will  work  entirely  through  a  single  interface  that  has  the  feel  of  an  editor,  but  an 
editor  that  can  interpret  programs  and  typeset  documents.  That’s  why  E-L  avoids  commands 
for  running  compilers,  text  formatters,  and  the  like.  The  user  focuses  on  creating  and 
manipulating  the  structure  of  primary  objects,  like  programs  and  documents.  A  derivation, 
that  is  a  job  whose  tool  is  a  deriver,  e.g.,  for  compilation  or  typesetting,  occurs  in  the 
background,  in  response  to  the  creation  of  new  artifacts,  and  in  anticipation  of  the  user’s 
need  for  derived  information.  Such  processing  is  only  loosely  coupled — via  plexes — to  user 
sessions.  It  is  distributed  on  the  network  to  take  advantage  of  available  computing  power 
and  to  avoid  degrading  session  responsiveness. 

You  may  have  the  idea  that  derivations  are  scheduled  only  when  drafts  are  commit¬ 
ted.  Actually,  commitment  and  scheduling  are  completely  separate  notions;  you  can  commit 
drafts  without  scheduling  any  jobs  and  schedule  deriver  jobs  without  committing  any  drafts. 
In  fact,  new  deriver  jobs  can  also  be  scheduled  as  part  of  carrying  out  existing  jobs.  While 
there  is  only  one  scheduling  algorithm,  there  is  no  central  scheduling  agent.  Scheduling  is  car¬ 
ried  out  by  multiple  processes,  distributed  over  the  network,  cooperating  via  the  scheduling 
plexes  and  using  still  other  plexes  to  guide  decisions. 

An  important  aspect  of  writing  a  deriver  is  making  it  “incremental”,  i.e.,  making  it 
efficiently  use  information  derived  from  similar  artifacts.  This  involves  looking  at  two  related 
artifacts,  i.e.,  two  relatives,  figuring  out  just  what  the  differences  are  and  what  information 
in  the  derivative  of  one  is  of  use  in  constructing  a  derivative  of  the  other.  Don’t  be  daunted 
by  the  apparent  complexity  of  this  situation.  There  is  a  style  common  to  all  derivers,  and  the 
addition  of  a  deriver  actually  consists  of  the  implementation  of  a  handful  of  relatively  simple 
functions,  each  of  which  plays  a  specific  role  in  determining  prerequisites  for  the  derivation, 
for  exstmining  relatives,  and  for  putting  the  pieces  together.  Chapter  4  explains  how  to  create 
a  new  deriver,  incorporate  it  into  a  worker,  add  the  deriver  to  the  derivers  plex,  and  add 
the  worker  to  the  workers  plex. 

Documentation  Conventions.  It  is  common  for  a  single  document  to  describe  the  im¬ 
plementation  of  several  related  artifact  types  and  derivative  classes.  The  first  chapter  of 
such  a  document  should  be  entitled  “Public  View”,  and  should  be  divided  into  two  sections, 
the  first  describing  derivative  classes,  and  the  second,  artifact  types.  In  both  cases,  it  should 
describe  what  other  extenders  need  to  know  about  the  data  structures  in  order  to  write  tools. 
The  first  of  these  sections  is  particularly  important,  because  it  tells  other  extenders  how  to 
produce  output  that  can  be  consumed  by  tools  that  you  are  providing. 

Summary.  The  reason  that  an  E-L  user  can  extend  the  artifacts  system  is  that  much 
of  the  behavior  isn’t  hidden  or  hard-wired.  Instead  it  is  captured  in  plexes  that  users  can 
change  in  controlled  ways.  The  programs  that  implement  the  behavior  of  artifacts  and  drafts, 
including  commands  in  the  user  interface  and  tools  that  derive  and  manipulate  information 
about  artifacts,  are  all  recorded  in  plexes  that  users  can  modify.  This  rest  of  this  guide  is 
about  the  structure  of  those  plexes,  the  nature  of  the  functionality  that  the  extender  can  or 
must  add,  and  the  commands  in  E-L  for  doing  that. 
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2  User  Interface  Extensions  for  Types 

This  chapter  describes  how  to  define  a  new  type  for  artifacts  and  drafts,  and  how  to  extend 
EmaCS  for  artifacts  and  drafts  of  that  type. 

2.1  The  types  Plex 

The  types  plex  lists  all  the  types  in  the  system.  An  entry  in  this  plex  has  the  form: 

•  name  resident-file;  version  auxiliary-file;  version 

In  a  view  on  the  types  plex  with  the  format  extenders,  each  entry  occupies  two  lines; 
otherwise,  lines  are  invariably  too  long.  The  other  possible  format — and  the  default— is 
users  in  which  only  the  names  of  the  types  are  listed,  in  alphabetical  order.  The  order  in 
which  types  appear  is  not  particularly  important.  However,  various  internal  data  structures 
use  the  index  of  a  type^  i.e.,  its  position  in  the  plex,  and  it  is  important  that  this  index  not 
chamge.  If  a  type  other  than  the  last  is  deleted,  you  will  see,  in  the  extender’s  view  on  this 
plex,  a  gap  (single  blank  line),  created  so  that  remaining  types  will  have  unchanged  indices. 
It  is  thus  possible  to  view  the  plex  and  determine  the  index  of  a  type.  When  you  define  a 
new  type,  it  will  occupy  the  first  available  gap,  if  there  is  any. 

The  files  in  the  plex  are  those  from  which  the  functionality  was  installed.  They  are  not 
the  actual  files  used  by  the  system.  Rather,  copies  of  the  files  are  made  and  stored  internally, 
and  the  system  uses  these  internal  copies.  The  version  numbers  indicate  how  many  times  the 
type  has  been  “repaired”,  i.e.,  the  number  of  times  a  new  file  has  been  installed.  A  version 
number  is  not  incremented  if  the  new  file  is  identical  to  the  internal  file. 

At  the  beginning  of  a  session,  E-L  loads  all  the  resident  files  (i.e.,  the  internal  copies), 
in  the  order  in  which  the  types  are  listed.  It  is  important  to  keep  resident  files  as  small  as 
possible;  this  will  be  discussed  in  more  detail  below.  The  auxiliary  files  (i.e.,  the  internal 
copies)  are  loaded  as  needed. 

2.2  Doing  It,  Repairing  It,  Undoing  It 

We  begin  with  the  Emacs  command  that  does  the  actual  installation: 

•  new-type  type  resident-file  auxiliary-file  toptional  initialization-file 

—  The  type  argument  is  the  name  of  the  new  type;  it  may  contain  only  alphanumerics 
and  hyphens. 

—  The  two  files  are  Emacs  lisp  that  specifies  the  functions  associated  with  the  new 
type.  The  auxiliary-file  should  end  with  (provide  ’  types- type) .  These  files  are 
discussed  in  more  detail  in  section  2.4. 

—  The  resident  and  auxiliary  files  will  be  loaded  to  check  for  errors  that  arise  in 
loading;  if  such  errors  arise,  the  installation  of  the  type  will  not  occur.  Otherwise, 
a  new  entry  is  made  in  the  types  plex. 
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-  If  the  initialization-file  is  not  nil,  it  is  loaded  after  the  type  is  installed  and  before 
any  startup  functions  (see  eval-when-startup,  below)  are  called.  Its  purpose  is 
to  initialize  the  type.  When  new-type  is  used  interactively,  a  prefix  argument  will 
cause  this  argument  to  be  requested;  otherwise,  it  will  be  nil. 

—  After  issuing  this  command,  you  will  probably  want  to  change  the  filter  of  your 
views  on  the  artifacts  plex  so  that  you  can  see  artifacts  with  your  new  type. 

If  your  implementation  is  correct  the  first  time,  you  may  skip  the  next  function.  If 
however,  you  discover  bugs  in  the  functions  for  a  type,  you  may  install  the  fixes  with  the 
following  command,  legal  only  when  the  type  already  exists; 

•  repair-type  type  resident-file  auxiliary-file  ftoptional  repair- file 

—  The  arguments  are  the  same  as  to  new-type  (with  the  repair-file  acting  like  the 
initialization- file). 

—  Both  files  will  be  loaded;  if  an  error  occurs,  the  repair  will  not  be  made. 

—  Otherwise,  the  resident-file  and  auxiliary-file  arguments  supersede  those  that  are 
currently  in  the  types  plex,  but  the  version  numbers  are  incremented  only  if  the 
files  are  actually  different. 

—  If  the  repair-file  is  not  nil,  it  is  loaded  after  resident-  and  auxiliary-files  have 
replaced  the  old  values.  Its  purpose  is  to  repair  the  state  of  the  type,  for  example, 
to  uniformly  change  the  bodies  of  artifacts  with  type  type. 

A  type  may  be  deleted  by  the  following  command: 

•  delete-type  type 

This  will  not  succeed  if  there  are  still  artifacts  with  type  type. 

2.3  Representation  of  Artifacts 

To  write  editor  functions  and  other  tools  that  deal  with  the  internal  structure  of  artifacts, 
you  must  understand  how  E-L  stores  artifacts.  This  section  describes  the  storage  technique, 
and  is  thus  a  requirement  for  the  remainder  of  this  chapter,  and  for  the  next  chapter. 

From  the  user’s  point  of  view,  an  artifact  consists  of  a  structure — the  contents  of  the 
artifact — determined  by  the  type  of  the  artifact.  The  contents  may  have  occasional  references 
to  other  artifacts  (see  the  picture  below).  The  contents  may  be  relatively  unstructured,  e.g., 
English  text,  or  highly  structured,  e.g.,  baise  El-L.  We  are  concerned  here  only  with  managing 
the  references  to  other  artifacts,  not  the  details  of  the  contents  (which  are  type-dependent). 

From  the  extender’s  point  of  view,  an  artifact  is  dealt  with  via  an  artifact  pointer  (in  the 
picture  below,  the  pointers  are  p,  pi,P2,  and  ps).  In  both  EmaCS  Lisp  and  Common  Lisp,  an 
artifact  pointer  is  simply  a  short  string.  However,  life  for  the  extender  is  more  complicated 
than  simply  having  artifacts  with  pointers  to  other  artifacts.  A  practical  implementation  of 
E-L’s  updating  philosophy  requires  that  two  artifacts  having  the  same  contents,  and  differing 
only  in  references  to  other  artifacts,  be  storable  in  only  a  little  more  space  than  is  required 
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by  one  of  the  artifacts.  Accordingly,  there  is  a  two-level  storage  of  artifacts,  connected  by  a 
notion  of  an  artifact  handle,  which  is  a  part  of  the  contents  that  identifies  a  reference  relative 
to  the  set  of  references  for  the  artifact.  In  the  following  picture,  the  handles  are  shown  as 
circled  integers. 


While  the  user  has  only  the  idea  of  contents,  an  extender  must  carefully  distinguish  between 
what  is  technically  the  body,  labeled  b  above,  and  the  contents,  i.e.,  whatever  is  inside  the 
circle.  In  the  implementation,  a  body  is  again  a  string.  The  interpretation  of  this  string  is 
totally  type-dependent,  and  is  governed  by  the  following  rules: 

•  The  body  must  not  contain  a  []]],  [t],  or  0. 

•  Pragmatically,  the  implementer  should  ensure  that  the  string  is  reasonably  short.  For 
this  reason,  it  is  often  essentially  a  file  name  (see  section  2.9  for  the  whole  story  here) 
where  the  contents  can  be  found. 

The  rules  governing  artifact  pointers,  bodies,  and  handles  are  summarized  as  follows: 

•  The  contents  of  an  artifact  never  contains  a  pointer  to  another  artifact.  Rather,  at  each 
place  where  the  user  thinks  there  is  a  reference,  there  is  actually  an  artifact  handle. 
Handles  may  be  thought  of  as  being  numbered  consecutively:  1 . .  .k.  However,  the 
representation  of  a  handle  is  entirely  up  to  the  type  designer;  the  artifacts  system 
never  looks  directly  at  an  artifact’s  contents. 

•  The  database  associates  with  an  artifact  pointer  a  body,  a  list  of  references  (i.e.,  a  list 
of  artifact  pointers  corresponding  to  the  handles  in  the  body),  a  set  of  predecessors 
(i.e.,  their  pointers),  and  a  list  of  universal  attributes. 

An  important  consequence  of  these  rules  is  that  even  though  two  artifacts  have  different 
references,  they  can  still  share  a  single  body.  The  only  cost  for  the  second  artifact  is  the  record 
in  the  database  that  aissociates  the  pointer,  body,  references,  predecessors,  and  attributes. 

A  tool  can  query  the  database  for  the  information  associated  with  the  pointer,  and  only 
type-specific  tools  know  about  the  structure  of  contents,  so  only  such  tools  can  associate 
handles  with  artifacts.  The  database  knows  nothing  about  the  representation  of  bodies — it 
treats  them  only  as  strings.  A  tool  is  nearly  as  ignorant  of  artifact  pointers — it  passes  them 
around  but  never  looks  inside  them. 
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2.4  Resident  and  Auxiliary  Files 

Because  the  resident  file  is  loaded  at  the  beginning  of  an  E-L  session,  it  is  useful  to  keep 
it  small,  on  the  theory  that  the  functionality  for  any  given  type  is  likely  not  be  used  in  a 
given  E-L  session.  The  usual  custom  is  to  place  in  the  resident  file  only  user  commands  and 
their  key  bindings.  Sometimes,  even  many  of  the  user  commands  need  not  be  resident.  For 
example,  there  may  be  several  commands  associated  with  a  debugger,  none  of  which  make 
any  sense  until  the  debugger  is  started.  Thus  the  command  to  start  debugging  might  be 
resident,  but  the  other  commands  can  be  in  the  auxiliary  file.  Further,  utilities  defined  in 
one  plex’s  auxiliary  file  and  used  by  another  plex  should  not  be  made  resident,  even  by  an 
autoload.  The  preferred  style  is  to  place  a  call  on  the  following  function  in  the  auxiliary 
file  of  the  plex  that  may  need  it.  It  is  also  used  to  define  commands  which  must  be  visible 
in  the  resident  file. 

•  autoload-type  symbol  type  string  boolean 

—  The  arguments  are  exactly  the  same  as  ordinary  EmacS  Lisp  function  autoload, 
except  that  a  type  rather  than  a  file  is  the  second  argument.  The  full  definition 
is  expected  to  be  in  the  auxiliary  file  for  the  type. 

We  will  also  see  that  E-L  will  look  for  certain  functions  in  the  auxiliary  file  even  without 
autoload-type  (see  section  2.5).  Thus,  a  resident  file  can  actually  be  empty,  although  the 
is  unlikely,  because  a  type  will  probably  have  a  user  command  to  modify  it,  and  autoload- 
type  declarations  for  this  should  be  in  the  resident  file.  Similarly  an  auxiliary  file  can  contain 
only  the  provide,  but  it  is  not  likely  that  a  type  would  have  no  funciionality.  To  repeat  a 
previous  point,  it  is  not  good  form  to  put  the  implementation  for  a  type  in  the  resident  file, 
thereby  making  the  auxiliary  file  minimal. 

In  some  cases,  the  auxiliary  functions  for  one  type  may  use  autoload-type  to  define 
functions  from  another  type.  An  auxiliary  file  may  also  use: 

•  require-type  type 

—  Same  as  the  usual  EmacS  require,  but  takes  a  string  argument  and  implicitly 
specifies  the  auxiliary  file  for  type  as  the  file  to  be  loaded. 

—  It  does  not  make  sense  to  use  this  in  a  resident  file,  because  that  would  have  the 
effect  of  making  the  auxiliary  file  for  type  resident. 

It  is  also  possible  to  have  EMACS-lisp  files  not  associated  with  any  particular  type.  For  this, 
use  one  of: 

•  require-emacs-lisp  file 

—  The  argument  is  a  string  that  specifies  a  file  known  to  E-L  as  the  one  to  be  present. 

—  Such  a  file  should  end  with  (provide  ’file). 

—  This  is  occasionally  used  in  resident  files  to  make  basic  functionality  resident. 
Such  uses  should  be  rare  and  conscious. 
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•  autoload-emacs-lisp  symbol  file  string  boolean 

—  Same  as  autoload,  but  the  file  is  one  that  is  known  to  E-L. 

—  Utilities  are  described  in  the  document  E-L  Emacs  Utilities.  To  determine  the  file 
argument  for  this  command,  see  the  italicized  tag  that  is  near  the  right  margin 
in  the  implementation  of  the  utility. 

To  make  the  loading  of  resident  and  auxiliary  files  idempotent,  any  initialization  of  data 
structures  should  not  be  done  directly  by  top-level  forms  in  the  file,  but  within  “startup 
declarations”,  specified  by; 

•  eval-when-startup  forms 

—  The  forms  are  recorded  rather  than  evaluated  when  loading.  They  are  evaluated 
after  the  load  is  complete,  and  then  only  the  first  time  a  file  is  loaded. 

—  Startup  declarations  are  evaluated  in  the  order  in  which  they  occur  in  the  file. 
(The  resident  files  for  installed  types  are  loaded  in  the  order  in  which  types  are 
installed;  auxiliary  files  may  be  loaded  in  any  order.) 

Finally,  a  note  on  successful  use  of  the  repair-type  capability:  loading  a  resident  or 
auxiliary  file  should  be  idempotent.  Thus,  it  is  acceptable  to  use  def  const  to  define  constants 
(perhaps  even  preferable,  if  those  constants  are  subject  to  being  defined  wrong),  but  def  var 
should  be  used  for  variables  that  hold  session  state.  One  of  the  purposes  of  eval-when- 
startup  is  to  ease  fulfilling  the  idempotency  requirement. 

2.5  Standard  Behavior 

There  is  a  set  of  type-dependent  EmacS  functions  and  variables  that  are  subject  to  cus¬ 
tomization.  An  extender  may  define  none  of  these,  or  any  subset  of  them,  as  long  as  the 
definitions  are  self-consistent.  If  /  is  a  “family”  of  type-dependent  functions  or  variables, 
the  user  customizes  /  for  type  t  merely  by  defining  a  function  or  variable  whose  name  is  f-t, 
e.g.,  (def [un|var|const]  f-t  .  . .).  If  the  extender  makes  no  such  definition,  the  function 
or  variable  standard-/  is  used  instead.  Moreover,  if  this  function  is  not  resident,  the  system 
will  look  in  the  auxiliary  file  for  the  type  t  (see  section  2.2  and  2.4),  so  there  is  no  need  to 
make  these  functions  resident  even  with  an  autoload-type.  To  keep  resident  functionality 
small,  it  is  in  fact  preferable  to  say  nothing  about  f-t  in  the  resident  file  for  t. 

The  functions  standard-/  are  also  available  to  the  extender.  The  first  argument  of 
standard-/  is  a  type  (a  string,  sometimes  ignored),  and  the  subsequent  arguments  are  the 
same  as  f-t.  It  is  common  for  an  implementation  of  f-t  to  call  standard-/.  In  many 
situations,  this  is  recommended: 

•  If  standard-/  is  not  entirely  trivial,  then  it  is  good  practice  for  f-t  to  call  standard- 
/,  and  to  add  its  own  functionality  by  other  actions  before  and/or  after  its  call.  In 
particular,  it  is  recommended  that  you  not  find  the  implementation  of  standard-/  and 
copy  its  pieces  into  your  function,  because  this  makes  it  less  likely  that  your  extension 
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will  pick  up  improvements  made  to  standard  functions  and  less  likely  that  they  will  be 
robust  in  the  face  changes  to  the  implmentation  details  of  standard  functions. 

The  standard  way  to  store  the  contents  of  an  artifact  of  type  type  is  as  a  file  in  the 
directory  repository/ tyj>es/ type,  where  repository  is  the  directory  in  which  data  for  E-L  is 
stored.  The  standard  body  for  an  artifact  of  type  t  is  a,  file  name  relative  to  the  standard 
directory.  The  file  name  is  automatically  generated,  and  is  usually  referred  to  as  a  uid,  for 
“unique  identifier”.  A  uid  is  represented  as  a  string. 

The  contents  of  an  artifact  is  standardly  represented  eis  standard  artifact  text,  which  is 
simply  ordinary  text  with  the  following  convention  for  dealing  with  handles: 

•  The  handle  (1 -based)  is  represented  by  the  character  code  256  —  i. 

This  means  that  there  is  a  limit  of  128  handles  per  artifact  buffer  before  these  characters 
start  to  overlap  with  standard  ASCII.  [This  is  subject  to  future  extension;  for  the  present, 
there  are  pragmatic  reasons  to  avoid  artifacts  with  this  many  references.] 

There  is  in  general  a  difference  between  the  contents  of  an  artifact  as  stored  in  the 
repository  {repository  form),  which  is  what  was  described  above,  and  what  is  reasonable  for 
a  person  to  look  at  {display  form),  in  which  case,  handles  are  represented  in  one  of  two  ways, 
either  by  special  characters  (for  non-Epoch  use),  just  as  in  standard  artifact  text,  or  as  Epoch 
buttons.  Clients  of  this  facility  thus  may  make  no  assumptions  about  the  representation, 
but  must  use  utilities  for  various  operations.  Drafts  often  use  the  display  form  of  standard 
artifact  text. 

If  standard  artifact  text,  both  in  display  and  repository  form,  is  sufficient  for  the  purposes 
of  your  extension,  the  only  definitions  that  you  will  need  to  make  are  those  that  specify  edit 
modes  and  that  inform  the  system  about  derivative  kinds  for  an  artifact  type:’ 

0  art  if  act -edit -mode- tj/pe 

—  E-L  calls  this  with  the  current  buffer  having  the  repository  form  of  an  artifact  of 
the  given  type. 

—  Since  buffers  on  artifacts  are  read  only,  there  is  not  a  lot  for  this  function  to  do, 
but  it  is  often  useful  to  set  the  edit  mode  for  better  navigation  in  the  buffer,  e.g., 
so  that  C-x  C-f ,  etc.,  do  the  right  thing  for  Lisp  artifacts. 

—  standard-artifact-edit-mode  makes  artifact  reference^  interact  with  the  kill¬ 
ring  by  pushing  a  suitable  function  onto  copy-buttons-as-kill-hooks. 

o  derivative-kinds-tppe  — +  kind-&-priority—list 

—  This  must  yield  a  list  of  pairs  consisting  of  the  possible  kind  of  derivatives  (see 
chapter  3)  for  an  artifact  of  the  given  type  having  the  given  body,  together  with 
the  default  priority  for  scheduling  jobs  to  obtain  the  derivatives.  A  priority  is 
number  between  1  and  9,  with  lower  numbers  having  higher  priority,  or  else  is 
nil,  indicating  that  no  job  should  be  scheduled.  The  usual  priority  is  5  or  nil. 

’The  “hollow  bullets”  indicate  functions  that  an  extender  must  write.  Solid  bullets,  if  they  introduce 
functions,  are  used  when  the  function  is  built-in  and  available  for  the  extender. 
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-  All  of  the  possible  kinds  must  occur,  even  if  they  are  never  to  be  derived  oppor¬ 
tunistically,  because  this  list  is  also  used  in  deleting  derivatives  when  an  artifact 
is  deleted. 

—  In  many  but  not  all  cases,  the  result  is  independent  of  the  body. 

—  standard-derivative-kinds  yields  the  null  list. 

If  defined,  the  definitions  of  these  functions,  like  all  functions  an  extender  defines,  go  in  the 
auxiliary  file. 

2.6  Simple  Use  of  Templates 

This  section  requires  knowledge  of  the  public  operations  on  templates,  which  are  discussed 
in  E-L  Emacs  Utilities.  There  is  no  point  in  continuing  until  that  material  is  understood. 

We  saw  in  the  previous  section  that  if  you  are  happy  with  standard  artifact  text,  the 
only  functions  you  need  to  write  for  a  type  extension  are  for  edit  modes.  In  this  section,  we 
provide  a  means  of  extending  EmacS  lisp  functionality  that  is  an  intermediate  point  between 
the  strait-jacket  and  simplicity  of  standard  artifact  text  and  the  flexibility  and  complexity 
of  arbitrary  extensions.  It  covers  situations  in  which  the  difference  between  the  repository 
and  display  forms  for  artifacts  of  the  given  type  can  be  characterized  by  two  templates  that 
describe  the  first  part  of  the  contents  of  the  artifact  in  the  respective  forms: 

•  To  obtain  the  display  form,  extract  a  value  with  the  “repository  template” ,  delete  the 
matched  text,  and  re-insert  the  vadue  with  the  “display  template”.  Set  the  buffer-local 
variable  header-end-point  to  a  marker  just  after  the  inserted  text. 

•  To  obtain  the  repository  form,  do  the  same  thing,  but  exchanging  the  two  templates, 
and  leaving  header-end-point  unaffected. 

For  this  to  work,  it  is  clearly  necessary  that  the  values  cissociated  with  the  two  templates 
have  the  same  structure,  i.e.,  that  whatever  is  extracted  with  one  can  be  inserted  with  the 
other. 

A  simple  example  occurs  in  lisp-source  artifacts.  In  display  form,  a  buffer  begins 
with  Caption:  (in  inverse-video  and  read-only)  and  the  rest  of  the  first  line  is  the  text  of 
the  caption.  In  repositc’'”  form,  the  Caption:  is  replaced  by  a  simple  semicolon,  for  the 
expedient  reason  that  the  ..  j.sp-forms  derivative  of  the  artifact  can  be  shared  with  the  body 
(at  least  when  there  are  no  references,  which  is  the  usual  case). 

To  obtain  this  simple  correspondence  between  display  and  repository  form,  it  is  necessary 
to  ensure  that  the  template  machinery  is  loaded  and  to  declare  that  the  type  in  question  is 
using  templates,  and  that  is  all.  For  the  lisp-source  example,  the  following  declaration  goes 
in  the  auxiliary  file  for  lisp-source  artifact.  (See  below  for  the  syntax-category  caption.) 

(require-emacs-lisp  "templated-headers") 

(use-tenplates  "lisp-source"  ’ (cone at  caption  line)  ’(concat  ";"  line)) 
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The  first  argument  here  is  the  type.  The  other  arguments  are  template  specifiers,  as  discussed 
in  E-L  Emacs  Utilities.  All  that  needs  to  be  understood  here  is  that  templates  provide  a 
way  of  mapping  between  text  in  a  buffer  and  Emacs  lisp  values;  use  of  the  functions  for 
templates  is  not  required.  The  above  example  is  an  instance  of: 

•  use-templates  type  template-specifier  template-specifier 

—  The  second  argument  specifies  the  display  form  for  the  header  of  artifact  contents 
of  the  given  type,  and  the  third  argument  specifies  the  header  for  the  repository 
form. 

-  In  the  absence  of  any  further  declarations  (see  section  2.8),  the  values  associated 
with  the  two  template-specifiers  must  have  the  same  structure,  and  the  part  of 
the  body  not  specified  by  the  template  is  standard  artifact  text. 

For  applications  of  templates  in  artifacts,  the  Emacs  Lisp  utilities  templated-headers 
(example  above)  does  a  (require-emacs-lisp  "templates"),  and  in  addition,  defines  the 
following  template  syntactic  categories  that  are  specific  to  artifacts  and  drafts. 

•  handle — the  syntax  is  the  standard  display  of  an  artifact  reference.  The  data  is  an 
integer,  specifically,  a  1-based  handle  index.  (Do  not  use  this  for  repository  form!  It 
sets  up  the  display  for  a  handle  in  text  form.) 

•  caption — this  syntax  category  is  effectively  (permament  "Caption:  "). 

Uses  of  the  handle  category  are  often  in  conjunction  with  the  machinery  provided  in  the 
next  section,  and  we  postpone  an  example. 

Recall  the  discussion  in  section  2.5  of  the  customization  of  a  family  /  for  a  type  t.  The 
way  that  use-templates  works  is  to  make  a  stylized  definition  of  f~t  for  all  families  /  where 
template- controlled  behavior  is  desirable.  These  definitions  all  have  the  form: 

•  (defun  f-t  (ftrest  args)  (apply  'templated-/  t  args)) 

Any  definitions  of  f-t  that  follow  use-templates  will  override  this  canonical  definition;  it 
is  thus  possible  to  get  the  templated  behavior  as  a  default,  but  do  further  customization 
where  necessary.  Such  further  customization  will  often  use  a  direct  call  on  templated-/  to 
do  most  of  the  work.  The  behavior  of  templated-/  will  be  given  as  part  of  the  descriptions 
of  each  family,  in  later  sections. 

2.7  Bodies  and  Contents 

As  noted  above,  the  contents  of  an  artifact  is  usually  stored  in  a  file  named  by  a  uid.  It 
is  possible  to  make  most  extensions  without  having  to  know  where  these  files  are  stored  or 
how  the  uids  are  obtained.  In  fact,  as  we  saw  in  the  last  section,  it  is  often  possible  to  make 
extensions  without  even  being  aware  of  what  a  uid  is.  In  those  extensions,  the  body  of  the 
artifact  is  exactly  the  uid. 

It  is  quite  common  to  customize  artifact  bodies  of  a  given  type  by  including  in  the  body 
a  bit  more  information  than  just  the  uid.  The  motivation  is  usually  speed  in  finding  out 
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something  about  the  contents  without  having  to  do  a  file  access.  An  example  is  a  body  that 
has  both  a  uid  and  an  integer.  The  standard  values  of  the  following  provide  the  case  in 
which  a  body  is  exactly  a  uid. 

o  uid-of-artifact-body-re-format-tj^e 

—  This  is  a  regular  expression  with  a  “hole”  that  locates  the  uid  within  the  body. 
The  hole  is  indicated  by  y,s.  For  example,  if  the  body  consists  of  a  uid  followed 
by  a  vertical  bar  and  an  integer,  the  string  should  be  */,s  I  [0-9]  +.  (See  section  2.8 
for  aji  example.) 

—  NB;  There  must  not  be  a  \  (  preceding  the  hole. 

—  If  this  value  is  the  empty  string,  then  it  is  assumed  that  the  body  itself  is  the 
entire  contents  of  the  artifact. 

—  standard-uid-of-artif act-body-re-format  has  the  value  Xs,  i.e.,  the  uid  con¬ 
stitutes  the  entire  body. 

o  artifact-body-uid-re-fj/pe 

—  This  is  a  regular  expression  that  must  match  a  uid. 

—  standard-artif  act-body-uid-re  has  the  value  [0-9]+,  i.e.,  a  standard  uid  has 
one  or  more  digits. 

To  be  completely  precise  about  how  these  two  variables  work,  we  describe  how  they  are  used. 
One  issue  is  the  deletion  algorithm,  which  must  cope  with  the  fact  that  many  artifacts  can 
share  the  same  body  (see  section  2.3),  and  thus  the  same  uid,  which  may  not  be  deleted  until 
the  last  reference  to  it  has  disappeared.  Briefly,  after  an  artifact  is  deleted,  the  following 
algorithm  is  applied  to  its  body. 

•  If  uid-of-artifact-body-re-format-<j/pe  is  the  empty  string,  there  is  no  uid,  and 
nothing  else  to  do  (delete-art if  act-uid-tj/pe  is  never  called).  Otherwise,  the  body 
does  have  a  uid. 

•  Denote  artifact-body-uid-re-/ppe  by  uid-re,  let  body-re  be  the  result  of  substitut¬ 
ing  \iuid-re\)  for  the  hole  in  uid-of-artifact-body-re-f ormat-type,  and  let  re  be 
’’body-re$.  Then  re  must  match  the  entire  string  (that  is  the  role  of  “  and  $),  and  the 
uid  is  extracted  as  the  first  match.  This  construct  explains  why  uid-of-artifact- 
body-re-format-fype  must  not  have  a  \(  preceding  the  hole. 

•  Another  artifact  is  said  to  “share  a  uid”  with  the  deleted  artifact  if  its  body  is  exactly 
matched  by  the  regular  expression  obtained  by  substituting  the  uid  for  the  hole  in  uid- 
of-artif  act-body-re-format-type.  A  search  is  made  for  artifacts  with  the  same  type 
and  sharing  the  same  uid.  If  none  are  found,  delete-artif  act-uid-type  is  applied  to 
the  uid. 
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Another  situation  in  which  these  same  variables  are  used  is  in  archiving.  In  this  case,  uid- 
of-artifact-body-re-format-<ype  is  used  to  determine  if  the  artifact  has  a  uid;  if  so,  its 
contents  become  part  of  the  exported  archive,  and  when  importing  the  archive,  a  new  uid 
is  created  using  them,  and  a  new  body  constructed  from  the  non-uid  part  of  the  archived 
artifact’s  body  and  the  new  uid. 


2.8  Advanced  Use  of  Templates 

The  use  of  templates  alone  is  insufficient  to  implement  a  design  in  which  the  body  of  an  ar¬ 
tifact  has  information  that  is  in  the  buffer  in  display  form.  This  section  provides  a  paradigm 
for  such  situations.  It  assumes  knowledge  of  built-in  template  syntax  categories  and  con¬ 
structors  (see  E-L  Emacs  Utilities)  as  well  as  ordinary  Emacs  regular  expressions. 

The  first  step  is  to  define  the  templates.  In  the  example  below,  it  is  clear  that  the  two 
templates  are  not  inverses  of  one  another;  they  don’t  even  contain  the  same  information.  The 
list  of  handles  in  the  first  template  disappear  in  the  second.  In  repository  form,  rather  than 
being  stored  in  the  contents  of  the  uid,  it  will  be  represented  as  an  integer,  which  together 
with  the  uid,  constitutes  the  body.  The  integer,  say  n,  says  that  the  first  n  references  are 
part  of  the  header;  the  remaining  references  are  found  in  the  contents  of  the  uid,  stored  as 
in  standard  artifact  text. 

(require-emacs-lisp  "templated-headers") 

(use-templates  "module" 

'(concat*  caption  line 

(permanent  (divider  "Prerequisites")) 

(list  (concat  handle  "\n")) 

(permanent  (divider  ""))) 

’line) 

The  display  forms  looks  like: 

Caption:  caption 

-  Prerequisites  - 

referencei 

referencek 


The  repository  form  has  only  the  caption;  as  we  shall  see,  the  references  are  stored  differently. 

When  the  repository  and  display  templates  have  associated  values  that  are  structurally 
dissimilar,  it  is  necessary  to  define  three  functions  used  in  converting  between  the  two  forms. 
The  function  that  produces  display  form  gets  as  an  argument  the  body  with  the  uid  stripped 
out,  and  produces  the  values  to  be  inserted  via  the  display  template.  There  are  two  functions 
used  to  produce  repository  form  from  the  display  form  of  a  buffer.  The  first  yields  a  “body 
format”  and  a  list  of  handles.  The  second  produces  the  values  to  be  inserted  via  the  repository 
format.  The  standard  versions  of  these  functions  give  the  behavior  described  in  section  2.6. 
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0  templated-repository-to-display-iype  residual-body  w  —*  v 

—  Called  on  a  buffer  with  the  repository  form  of  an  artifact,  after  standard- 
display-artifact-  contents  has  been  called.  It  must  not  disturb  point. 

—  The  residual-body  is  called  that  because  it  is  the  body  with  its  uid  removed. 
Technically,  it  is  obtained  by  using  the  value  re  defined  in  connection  with  deletion 
(see  the  previous  section)  to  locate  the  uid  within  the  body,  and  to  yield  the 
concatenation  of  the  strings  preceding  and  following  the  uid  substring.  With  a 
stauidard  body  format,  namely  %s,  the  residual-body  is  always  the  empty  string. 

—  The  value  w  was  obtained  by  using  the  repository  template  to  parse  the  buffer, 
starting  at  the  beginning. 

—  The  value  v  will  be  inserted  according  to  the  display  template. 

—  This  function  is  relevant  only  when  use-templates  is  in  effect. 

—  standard-templated-repository-to-display  checks  that  the  residual  body  is 
the  empty  string  (issues  an  error  if  not),  and  yields  w  as  the  value  for  v. 

o  body-f  ormat-&-handles-that-appear-in-header-iype  v  —*  body-format  handle-list 

—  The  value  v  passed  to  this  function  was  extracted  from  the  buffer  using  the  display 
template;  the  point  parameter  w?is  the  end  of  the  match. 

—  The  first  result  is  a  body-format,  a  string  with  a  “hole”  (5is)  in  it  for  the  uid. 
The  body-format  becomes  the  body  after  the  hole  is  replaced  by  the  uid,  unless 
uid-of-artifact-body-re-format-fype  is  the  empty  string,  in  which  case  the 
body-format  is  the  body  itself. 

—  The  last  result  is  a  list  of  handies  actually  appearing  in  the  part  of  the  buffer 
matched  by  the  display  template.  The  references  of  the  eventual  artifact  will 
first  have  these  references  in  the  order  specified,  and  then  those  of  the  rest  of  the 
buffer,  in  order. 

—  The  two  results  must  be  combined  with  cons. 

—  This  function  is  relevant  only  when  use-templates  is  in  effect. 

—  st2uidard-body-f  ormat-ft-handles-that-appear-in-header  yields  a  */,s  as  the 
body  format,  and  a  null  list  of  handles. 

o  templated-display-to-repository-fj/pe  point  v  —*  w 

—  Called  when  the  current  buffer  is  on  a  buffer  with  the  display  form  of  an  artifact, 
or  a  buffer  on  a  draft.  It  must  not  disturb  point. 

—  The  value  v  and  point  are  the  same  as  in  the  above  function. 

—  If  this  function  returns  without  error,  w  will  be  inserted  according  to  the  reposi¬ 
tory  template,  and  if  no  error  occurs,  the  text  matching  the  display  template  will 
be  deleted. 
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-  When  this  function  is  used  in  a  commit,  signaling  an  error  will  abort  the  commit. 

-  This  function  is  relevant  only  when  use-templates  is  in  effect. 

-  standard-templated-display-to-repository  yields  v  as  the  value  for  w. 

Definitions  of  these  functions  for  the  example  of  this  section,  and  the  variables  describing 
the  structure  of  the  body  are  given  below.  (NB:  The  XV,  in  the  argument  to  format  below 
results  in  a  single  X  in  its  result;  the  Xd  is  replaced  by  the  decimal  digits  of  the  length  of  the 
list  of  h2mdles.  Thus,  the  result  is  a  legitimate  body-format.) 

(defun  templated-repository-to-display-module  (residual -body  caption) 

(let  ((n  (string-to-int  (get-match  "l\\([0-9]+\\)"  residual-body))) 

(handle-list  nil)) 

(while  (>  n  0) 

(ppush  n  ’handle-list) 

(deer  ’n)) 

(cons  caption  handle-list))) 

(defun  body-format-Jt-handles-that-appear-in-header-module  (caption-ft-handle-list) 
(let  ((handle-list  (edr  caption-t-handle-list))) 

(cons  (format  "XXsIXd"  (length  handle-list))  handle-list))) 

(defun  templated-display-to-repository-module  (end  caption-ft-handle-list) 

(car  caption-ft-handle-list)) 

(defvar  uid-of-artif act-body-re-format-module  "Xs|[0-9]+") 

See  E-L  EmacS  Utilities  for  get-match,  deer,  and  ppush. 

As  another  example  of  the  use  of  these  functions,  we  show  how  they  are  used  in  the 
implementation  of  lisp-source  artifacts — the  description  in  section  2.6  was  an  oversimpli¬ 
fication.  The  real  story  is  this:  if  the  caption  in  display  form  is  not  blank,  the  caption  is 
repository  form  is  “;1”  followed  by  the  caption  text  supplied  by  the  user.  If  the  caption 
in  display  form  is  blank,  the  caption  in  repository  form  is  “ ;  2”  followed  by  a  caption  that 
is  automatically  filled  in  by  Emacs.  (A  caption  of  the  latter  form  is  not  typeset  with  the 
source  artifact  itself,  only  at  references  to  it.)  So  in  addition  to  the  templates  described 
earlier,  there  are  also  definitions  of  two  of  the  three  possible  functions.  (The  missing  one  is 
not  needed  because  bodies  in  this  case  are  standard.) 

(defun  templated-repository-to-display-lisp-source  (residual-body  caption) 

(if  (*  (aref  caption  0)  ?1) 

(string-cdr  caption) 

■"’)) 

(defun  templated-display-to-repository-lisp-source  (point  caption) 

(if  (string®  caption  "") 

(save-excursion  [code  which  constructs  the  default  caption]) 

(concat  "1"  caption))) 

See  E-L  Emacs  Utilities  for  string-cdr. 
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2.9  Advanced  Emacs  Lisp  Extensions  for  Artifacts 

This  section  has  the  full  story  on  extending  EmacS  functionality  for  an  artifact  type.  It 
is  required  reading  only  when  standard  artifact  text  and  templates  are  not  satisfactory  for 
your  extension. 

2.9.1  Functions  Involving  Uids 

You  can  override  the  default  use  of  the  artifact  repository,  and  put  artifacts  of  a  given  type 
wherever  you  want. 

o  new-axtif  act-uid-typc  —*  uid 

—  Given  no  arguments,  this  function  must  produce  a  string  without  Q,  0,  or  0 
that  is  a  suitable  uid  for  artifacts  of  the  given  type. 

—  standard-new-artifact-uid  uses  the  file  repository/ types/ type/ serial-nxim- 
ber  to  yield  a  uid  (not  a  full  pathname)  for  a  heretofore  non-existent  file  or 
subdirectory  in  the  repository/ types/ type/  directory. 

o  write-artif act-contents-fj/pe  uid 

—  El-L  calls  this  function  with  a  uid  obtained  with  the  previous  function.  It  is  to 
write  the  contents  of  the  current  buffer  to  the  uid. 

—  stzmdard-write-artif act-contents  writes  the  current  buffer  to  the  file  repos¬ 
itory/types/  type/  uid. 

o  read-artifact-contents-fype  uid 

—  E-L  calls  this  function  with  a  uid  to  which  the  previous  function  has  been  applied, 
and  the  following  one  hasn’t.  It  is  to  read  what  was  written  into  the  current  buffer 
after  point. 

—  standard-read- art  if  act-contents  inserts  repository/ types /type/ uid 'mio  the 
current  buffer. 

o  delete-artif act-uid-fype  uid  — »  boolean 

—  FrL  calls  this  function  with  a  uid  which  is  no  longer  referenced.  It  should  scavenge 
the  space  now  used  by  the  uid. 

—  The  result  should  be  nil  if  the  uid  was  already  deleted;  otherwise,  t.  (If  two 
sessions  are  racing  to  delete  artifacts  with  the  last  references  to  the  uid,  it  is 
possible  for  them  both  to  attempt  to  delete  it.  Thus  the  nil  result  is  rare,  but 
possible.) 

—  standard-delete-artifact-uid  deletes  the  file  repository/types/type/uid. 
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2.9.2  Display  Functions 

When  the  user  asks  to  examine  an  artifact,  the  generic  function  calls  read-art  if  act- 
contents-iype  to  initialize  the  buffer  with  the  repository  form  for  the  contents,  then  calls 
the  first  of  the  following  functions  to  produce  the  overall  display  form,  and  finally  calls  the 
second  one  for  each  of  the  handles. 

o  display-artifact-contents-fype  residual-body 

—  The  current  buffer  has  been  initialized  with  the  contents  of  an  artifact  in  repository 
form. 

—  The  residual-body  is  the  same  as  for  templated-repository-to-display-fj/pe. 

—  In  Epoch,  standard-display-artifact-contents  covers  each  meta-character 
256  —  i  with  a  button.  In  Emacs,  the  standard  function  is  a  no-op. 

—  templated-display-artifact-contents  initializes  the  buffer  using  standard- 
display-artifact-contents.  It  then  extracts  w  using  the  repository  template, 
dispatches  to  templated-repository-to-display-fype  to  obtain  the  value  v, 
deletes  the  matched  region,  and  using  the  display  template,  inserts  these  values 
into  the  buffer. 

o  display-artifact-reference-fi/pe  handle  artifact-pointer 

—  E-L  calls  this  with  the  current  buffer  initialized  to  the  repository  form  of  artifact 
text,  modulo  any  changes  made  by  the  previous  function  and  by  previous  calls  on 
this  function.  This  is  first  called  with  the  handle  equal  to  1,  then  2,  etc. 

—  The  first  argument  is  a  handle,  and  the  second  is  the  corresponding  artifact 
pointer.  This  function  must  arrange  for  displaying  occurrences  of  the  handle  in 
the  current  buffer  as  references  to  the  given  artifact.  [A  later  edition  will  describe 
how  to  get  from  an  artifact  pointer  to  a  string  representing  the  artifact.] 

—  staindard-display-artifact-reference  builds  up  a  list  of  strings  for  the  ar¬ 
tifact  references,  storing  it  in  the  buffer  local  variable  special-char-strings, 
anticipating  the  day  when  meta-characters  can  be  displayed  as  arbitrary  strings. 
In  Epoch,  it  replaces  the  text  under  the  1-based  ith  button  with  ith  element  of 
special-char-strings.  In  Emacs,  there  are  no  changes  to  the  contents  of  the 
buffer. 

2.9.3  Other  Extension  Functions 

The  implementation  of  an  artifact  type  is  expected  to  provide  sensitivity  to  references  from 
within  a  buffer  on  an  artifact.  Since  the  generic  mechanism  assumes  nothing  about  the 
internal  representation  of  a  buffer,  it  uses: 


2-14 


Extender’s  Guide  for  Artifacts  and  Drafts 


o  accept-hamdle-from-artifact-buffer-iype  — »  handleU{0} 

—  EI-L  calls  this  function  when  a  command  requires  an  artifact  argument  and  point 
is  in  a  buffer  on  an  artifact  of  the  given  type.  If  point  is  near  a  reference,  for  some 
suitable  definition  of  “near”,  the  result  should  be  the  handle  for  that  reference; 
otherwise,  the  result  should  be  0. 

—  standard-accept-handle-from-artifact-buffer  defines  “near”  to  mean  “on 
the  same  line”.  Preference  is  given  to  the  first  reference  following  point;  otherwise 
the  first  reference  preceding  it.  In  Epoch,  it  is  necessary  to  look  at  the  button-list; 
in  EmaCS,  this  is  done  by  searching  for  meta-characters. 

Finally; 

o  art if act -commands -fype 

—  This  specifies  the  list  of  all  type-specific  artifact  commands,  i.e.,  commands  that 
take  an  artifact  argument  whose  type  is  constrained,  but  is  at  legist  allowed  to 
be  the  given  type.  E-L  uses  this  variable  in  constructing  the  menu  that  appears 
when  mousing  an  artifact. 

—  standard-artifact-commands  is  nil. 


2.10  diffing  Artifacts 

An  extender  may  write  several  functions  to  specialize  the  behavior  of  diff-artif  acts  (and 
related  commands)  for  a  new  type.  Because  of  the  possibility  of  using  such  commands 
on  artifacts  of  differing  types,  all  types  must  interact  with  diffing  via  the  leeist  common 
denominator  of  text  in  a  buffer.  First,  to  obtain  the  text,  the  system  calls: 

o  diff-input-from-artifact-contents-fype  separator  string-list  buffer 

—  The  current  buffer  hais  the  display  form  of  the  contents  of  an  artifact. 

—  The  separator  is  a  string  that  does  not  appear  in  the  buffer,  nor  in  the  buffer  that 
this  artifact  will  be  diffed  against. 

—  The  string-list,  which  we  will  denote  has  an  element  for  each  handle. 

—  The  job  of  this  function  is  to  copy  the  current  buffer  into  buffer  (which  is  initially 
empty),  and  to  replace  each  handle  with  the  string  is,x,  where  x  is  the  separator. 

—  stamdard-diff-input-from-artifact-contents  expects  that  the  current  buf¬ 
fer  will  have  standard  artifact  text  in  display  form. 

When  diffing  artifacts  of  the  same  type,  the  following  function  will  be  called  to  compute  the 
difference.  For  example,  the  extension  for  a  type  may  choose  to  diff  on  a  section  by  section 
basis. 
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0  diff-artifact-iype  buffer  buffer  buffer  — »  delta-list 

—  The  first  two  arguments  have  been  prepared  by  diff-input-from-artifact- 
contents-fype . 

—  The  delta-list  data  structure  is  explained  in  E-L  Eukcs  Utilities. 

—  standard-diff-artifacts  is  diff-buffer-fn;  see  E-L  Em  ACS  Utilities  for  de¬ 
tails. 

When  diffing  artifacts  of  different  types,  the  system  calls  stamdard-diff-artif acts. 

To  display  references  in  the  diff  result,  one  of  the  following  two  functions  is  called. 

o  insert-airtif  act-ref  erence-fypc  artifact  {nil,  old,  new} 

—  Insert  the  display  form  of  an  artifact  reference  at  point,  highlighted  according  to 
the  second  argument. 

—  standard-insert-artifact-reference  uses  white  on  blue  for  nil  (just  like 
standard  display-artifact-references),  red  on  blue  for  old,  and  green  on 
blue  for  new. 

0  insert-two-artifact-references-<ype  artifact  artifact  —*  delta 

—  A  single  xsx  corresponds  to  the  given  artifacts  in  the  respective  arguments  to  a 
diff  function.  This  function  should  display  both,  in  the  order  given,  as  if  there 
were  separate  calls  to  the  above  functions  with  a  second  argument  of  old  and 
new. 

—  The  only  reason  that  this  is  a  separate  function  is  so  that  the  extender  can  choose 
how  to  embed  the  difference  references.  This  should  be  reflected  by  the  result. 

—  standard-insert-two-artifact-references  replicates  the  line  in  which  the 
reference  occurs,  remembering  where  point  is  in  each.  The  delta  specifies  the  first 
and  second  line  as  the  red-green  regions.  Other  artifact  references  in  the  line  then 
result  in  calls  on  the  previous  functions. 

2.11  Drafts 

When  you  define  a  type,  you  must  decide  whether  you  wish  to  allow  drafts  of  that  type. 
If  so,  you  have  the  option  of  customizing  the  standard  operations  for  drafts.  This  section 
describes  how  to  do  so.  Note  that  for  drafts  a  body  is  interpreted  by  the  system;  it  must  be 
a  legitimate  file  name  with  a  fully  specified  path,  i.e.,  what  we  will  call  a  file.  All  functions 
below  must  be  defined  in,  or  are  utilities  written  in,  Emacs  Lisp. 

Extenders  of  functions  that  deal  with  references  in  drafts  must  be  aware  of  the  fact  that 
drafts  have  handles  just  like  <irtifacts,  but  that  each  handle  is  associated  not  with  a  single 
artifact  (or  draft),  but  with  a  set  of  what  are  called  subreferences,  each  subreference  referring 
to  either  an  artifact  or  a  draft,  distinguished  by  a  +  or  -  somewhere  in  the  structure  of  the 
subreference. 
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Extenders  must  also  interact  with  E-L’s  support  for  red-green  regions  in  drafts.  Much  of 
the  functionality  for  this  is  type  independent,  for  example,  the  way  in  which  red-green  regions 
are  highlighted  or  the  fact  that  it  is  an  error  to  commit  a  draft  with  red-green  regions.  There 
is  however,  an  opportunity  to  define  functions  like  those  that  are  used  in  diffing  artifacts. 

The  representation  of  a  draft  in  a  buffer  is  usually  very  close  to  that  of  the  display  form 
of  an  artifact  of  the  same  type.  There  is  no  notion  of  “repository  form”  for  drafts,  although 
drafts  do  have  an  autosave  form,  as  will  be  seen  below. 

o  new-draft -body- type  — »  file 

—  Must  produce  a  file  name  (including  path)  for  drafts  of  the  given  type.  This  file 
will  become  the  auto-save  file  for  the  buffer. 

—  This  function  is  allowed  to  change  the  contents  of  the  current-buffer,  for  ex¬ 
ample,  to  initialize  the  draft. 

—  st2mdard-new-draf  t-body  produces  the  string  repositoryitypesf  type/ uid,  with 
uid  chosen  so  that  the  string  is  not  currently  a  file  or  directory. 

o  create-draft-fi/pc 

—  Ei-L  calls  this  with  no  arguments,  and  expects  it  to  yield  no  results.  However,  the 
current  buffer  is  already  set  up,  and  if  desired,  this  function  may  initialize  it  in 
some  useful  way  (but  see  below  if  you  want  artifact  references  in  the  initial  text). 

—  This  is  part  of  the  action  taken  by  user  commands  that  create  a  draft  without 
predecessors,  e.g.,  creat e-draft. 

—  templated-create-draft  inserts  the  display  template,  supplying  no  values,  and 
thus  inserting  the  default  values. 

—  standard-create-drerft  does  nothing. 

o  draft-contents-from-axtif act-contents-fr/pe  residual-body 

—  When  Ei-L  calls  this,  the  current  buffer  has  been  initialized  with  the  contents  of 
an  artifact  in  repository  form.  Its  job  is  to  make  the  buffer  editable. 

—  The  residual-body  is  the  same  as  for  templated-repository-to-display-fype. 

—  This  is  part  of  the  action  taken  by  user  commands  that  create  a  draft  with  pre¬ 
decessors,  e.g.,  edit-artifact. 

—  standard-draft-contents-from-artif act-contents  dispatches  to  the  family 
member  display-artifact-contents-fype. 

o  display-draft-reference- type  handle  draft-reference 

—  E-L  calls  this  in  several  circumstances.  The  first  is  in  initializing  a  buffer  on 
a  draft;  after  calling  draft-contents-from-artifact-contents-<ype,  E-L  calls 
display-draft-reference-fypeonce  for  each  handle,  in  order  of  increcising  han- 
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dies.  The  second  circumstance  is  when  a  reference  has  been  added  to  a  draft 
buffer.  In  both  these  cases,  the  handle  will  be  one  larger  than  the  previous  largest 
handle,  or  1,  if  there  were  no  previous  handles.  The  final  circumstance  is  when 
the  reference  for  a  handle  changes,  in  which  case  the  handle  will  already  have 
been  seen. 

—  The  second  argument  is  a  list  of  subreferences,  each  of  which  is  of  the  form 
("+"  .  artifact-display-list),  or  .  d  draft- display-list) .  [Display  lists  for 

artifacts  and  drafts  will  be  discussed  in  a  later  edition.] 

—  This  function  should  arrange  for  proper  display  of  all  the  occurrences  of  the  han¬ 
dle  in  the  current  buffer,  according  to  the  information  provided  by  the  second 
argument. 

—  In  Epoch,  standard-display-drzrft-ref  erence  obtains  the  text  for  the  handle 
and  displays  it  in  a  button.  Otherwise,  it  updates  a  list  of  “special-char”  strings, 
which  we  hope  that  one  day  will  be  used  in  the  display  of  meta-characters. 

o  draft-edit-mode-fype 

—  This  is  the  last  chance  to  do  anything  in  a  type-dependent  way  before  the  user 
starts  editing.  It  is  common  for  this  function  to  establish  editing  modes. 

—  standard-draft-edit-mode  calls  artifact-edit-mode-typc,  and  (for  Epoch) 
makes  buttons  deletable  as  long  cis  the  deleted  region  spans  the  entire  button  and 
makes  references  interact  with  the  kill-ring  by  pushing  suitable  functions  onto 
copy-buttons-as-kill-hooks  and  yank-hooks. 

0  insert-handle-fj/pe  handle  {nil,  old,  new} 

—  Insert  the  handle  at  point  in  its  display  representation,  highlighted  according  to 
the  second  argument. 

—  In  Epoch,  standard- insert-handle  sets  up  a  button  with  appropriate  colors,  as 
in  insert-artifact-reference-;  in  EmacS,  it  simply  inserts  a  meta-character. 

o  accept-handle-from-draft-buffer-fi/pe  — +  handle  U  {0} 

—  Ei-L  calls  this  function  when  a  command  requires  an  artifact  or  draft  argument 
and  point  is  in  a  draft  buffer  of  the  given  type.  If  point  is  near  a  reference,  for 
some  suitable  definition  of  “nerir”,  the  result  should  be  a  handle  for  that  reference; 
otherwise  the  result  should  be  0. 

—  standard-accept-hauidle-from-draft-buffer  interprets  “near”  to  mean  “on 
the  same  line”.  Preference  is  given  to  the  first  reference  following  point;  otherwise, 
the  first  reference  preceding  it. 
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o  body-f ormat-ft-handles-that-appear-in-drarft-type  — »  body-format  handle-list 

—  Called  when  the  current  buffer  is  on  a  draft. 

-  The  first  result  is  a  body-format,  which  is  a  string  with  a  “hole”  (y,s)  in  it  for 
the  uid,  unless  uid-of-artifact-body-re-format-type  is  the  empty  string,  in 
which  case,  it  is  the  body  itself. 

—  The  second  result  is  a  list  of  handles  actuidly  appearing  in  the  buffer.  The  ;th 
element  of  this  result  will  correspond  to  the  tth  reference  of  an  eventual  artifact. 

—  These  results  are  represented  as  a  cons. 

—  St andard-body-format-A-handles-that-appear- in-draft  yields  y.s  as  the  bo¬ 
dy  format,  and  calls  the  following  function  to  obtain  the  handle-list. 

—  The  function  standard-handles-that-appear-in-draft  takes  no  arguments, 
and  yields  a  handle  list  based  on  the  current  buffer.  In  Epoch,  it  looks  for  buttons 
that  correspond  to  handles;  in  Emacs,  it  finds  meta-characters.  In  both  cases, 
the  order  of  handles  in  the  result  is  that  in  which  they  first  appear  in  the  buffer, 
with  duplicates  removed. 

—  templated-body-format-4-handles-that-appear-in-draf ttype  obtains  v  via 
the  display  template.  It  stashes  the  the  end  of  the  header  (as  a  marker)  and  i> 
on  a  buffer-local  variable,  and  calls  body-f oroat-ft-hauidles-that-appear-in- 
header-fypc  on  v  to  obtain  the  body  format  and  handles  in  the  header.  It  yields 
as  results  the  body  format  and  the  concatenation  of  the  handle  list  from  the 
header  and  the  handles  in  the  rest  of  the  buffer,  obtained  with  standard-body- 
format-ft-handles-that-appear- in-draft. 

o  artifact-from-draft-lock-ft-mode-list-fj/pe  — ♦  lock- & -mode-list 

—  It  is  very  rare  to  need  to  extend  this  function.  If  you  do,  it  is  probably  because 
of  the  need  to  examine  plexes  when  committing  an  artifact. 

—  To  understand  locking  in  general,  and  how  to  create  a  lock- & -mode-list  in  partic¬ 
ular,  see  Extender’s  Guide  for  the  E-L  System. 

—  standard-artifact-contents-from-draft-contents-lock-list  yields  a  list 
of  one  element,  used  to  obtain  a  new  serial  number  for  artifact  bodies  of  the 
given  type. 

o  artifact-uid-from-draft-contents-fype  handle-list  ftrest  goods-list  — >  uid 

—  When  E-L  calls  this,  the  current  buffer  is  a  buffer  on  a  draft. 

—  It  must  transform  the  contents  of  the  draft  to  the  repository  form  of  the  contents 
for  an  artifact  that  is  about  to  be  committed.  Then  it  must  store  the  contents  in 
the  repository  and  return  a  uid  for  incorporation  in  the  new  artifact  body. 

-  If  draft-edit-mode-type  set  up  any  buffer-local  variables  that  might  do  future 
harm  (when  the  buffer  becomes  a  buffer  on  an  artifact),  this  function  should  kill 
them. 
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—  The  handle-list,  call  it  {hi)i,  is  the  second  result  of  body-format-ft-handles- 
that-appear- in-draft- type. 

—  The  goods-list  has  one  element  for  each  element  in  the  result  of  the  previous 
function.  If  you  understand  locking,  you  will  know  what  this  is;  otherwise,  think 
of  it  as  a  magic  cookie  which  will  eventually  be  given  to  the  standard  version  of 
this  function  (see  below). 

—  This  function  must  modify  the  current  buffer  so  that  the  draft  handle  hi  becomes 
artifact  handle  i. 

—  If  desired,  this  function  can  refuse  permission  for  the  commit,  simply  by  using 
error.  In  this  case,  it  should  leave  the  contents  of  the  buffer  unchanged. 

—  stamdard-artifact-uid-from-draft-contents  calls  two  functions  discussed  in 
the  items  immediately  following  this  one:  scauidard-artifact-contents-from- 
draft -contents  and  standard-neu-artif  act-uid-from-contents. 

—  standard-artif act-contents-from-draft-contents  takes  two  arguments,  the 
type  (ignored)  and  the  list  of  handles.  It  replaces  references  in  a  draft  with  meta¬ 
characters  for  the  new  artifact  handles,  and  kills  the  buffer-local  variables  set  up  by 
standard-draft-edit-mode:  copy-buttons-as-kill-hooks  and  yank-hooks. 

—  standard-new-artif act-uid-from-contents  takes  two  required  arguments,  the 
type  and  the  magic  cookie  corresponding  to  the  one  element  in  the  result  of 
standard-artifact-contents-from-draft-contents-lock-list,  plus  an  op¬ 
tional  extension  (for  use  when  a  body  must  have  one).  It  uses  its  magic  cookie 
argument  to  obtain  a  new  uid  and  writes  the  buffer  to  the  file  obtained  from  that 
uid  (possibly  incorporating  the  extension). 

—  templated-artif act-contents-from-draft-contents  takes  the  same  two  ar¬ 
guments  as  the  standard  function.  It  uses  the  data  stashed  by  templated-body- 
f  ormat-ft-handles-that-appear-draft  to  delete  the  header  (having  first  deleted 
any  buttons  there),  canonicaJize  the  handles  not  in  the  header,  insert  xr  using  the 
repository  template,  obtain  a  new  uid,  and  write  the  contents  to  the  file. 

0  merge-art  if  act-into-draft-type  body  diff-flag 

—  When  called,  the  current  buffer  is  on  the  draft  in  question. 

—  The  body  of  the  artifact  of  the  given  type  is  body. 

—  If  diff-flag  is  true,  there  is  supposed  to  be  a  diff  representation.  Otherwise,  the 
contents  of  body  should  be  appended  to  the  draft,  for  some  suitable  definition  of 
append. 

—  standard-merge-artif act-into-draft  does  a  read-artif act-contents-lype 
at  the  end  of  the  buffer,  narrows  to  the  region  just  read,  does  a  draft-contents- 
f  rom-artifact-contents-type,  widens,  and  places  the  cursor  just  before  the  new 
material. 
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—  templated-merge-artif  act-into-draf  t  inserts  the  contents  of  the  artifact  with 
standard-merge-artif act-into-draft,  extracts  the  header  v  using  display- 
template- deletes  the  header,  and  calls  merge-header-into-draf t-<j/pe 
on  V. 

—  stamdard-merge-header-into-draft  inserts  the  print  form  of  v  in  place  of  the 
header.  We  are  also  considering  having  this  function  signal  an  error. 

0  auto-save-draft-Zj/pc  buffer  buffer 

—  E-L  calls  this  function  when  autosaving  a  buffer  on  a  draft  and  when  abandoning 
a  draft.  The  first  buffer  is  the  buffer  on  the  draft;  the  second  is  a  temporary 
buffer  which  is  to  be  filled  with  the  data  which  E-L  will  write  to  the  auto-save 
file.  The  form  of  the  contents  of  the  second  buffer  is  the  operational  definition  of 
the  “auto-save  form”  of  a  draft  of  this  type. 

—  The  purpose  of  this  function  is  to  allow  the  extender  to  auto-save  the  values  of 
buffer-local  variables.  It  should  not  modify  the  contents  of  the  first  buffer,  because 
such  changes  would  become  part  of  the  undo  information  for  the  buffer. 

—  standard- auto- save-draft  appends  to  the  temporary  buffer  the  value  of  point 
and  mark,  a  description  of  each  Epoch  button  in  the  draft,  and  the  contents  of 
the  draft  buffer.  In  order  for  buttons  to  be  recoverable  during  takeover-draft  or 
w’hen  revert-buff  er  is  applied  to  a  draft,  their  button-data  must  be  such  that 
it  gives  an  equal  value  after  being  printed  and  read  back,  where  equal,  print, 
and  read  are  the  EmacS  Lisp  primitives.  Note  that  functions  for  non-standard 
cases  may  insert  whatever  information  they  need  in  the  second  buffer  and  then 
call  standard-auto-save-draft  to  do  the  rest. 

0  revert -draft -<ype 

—  This  function  is  a  companion  to  the  above,  and  is  used  when  recovering  or  taking 
over  a  draft.  The  current  buffer  has  been  initialized  to  the  auto-saved  data,  and 
is  to  be  restored  to  the  state  at  the  time  of  auto-saving  or  abandonment. 

—  stamdard-revert-drecft  sets  up  point  and  mark  from  the  contents  of  the  buffer 
and  deletes  this  information;  it  extracts  and  deletes  button  descriptions,  and 
then  recreates  the  buttons;  and  it  updates  artifact-  or  draft-reference  buttons  to 
reflect  changes  that  may  have  taken  place  since  the  draft  was  auto-saved.  Buttons 
are  re-created  with  the  aid  of  the  buffer-local  list  revert -butt on-hooks.  Each 
member  of  that  list  is  a  function  that  takes  a  button-data  value  and  yields  a 
button  attribute  value  or  else  nil  (meaning  “try  the  next  function”).  The  default 
initialization  of  revert-button-hooks  contains  a  single  function,  called  button- 
attribute-if-reference,  that  deals  with  artifact-  and  draft-reference  buttons. 
Other  functions  can  be  added  to  the  list  by  draft-edit-mode- /.j/pe. 

—  templated-revert-draft  first  calls  steuadard-revert -draft.  It  then  tries  to 
extract  the  header  with  the  display  template,  and  if  it  succeeds,  it  re-inserts  the 
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value  with  the  same  template.  Observe  that  autosave  cannot  save  the  draft  in 
repository  form,  because  there  is  no  guarantee  that  the  header  has  correct  syntax 
in  the  middle  of  am  edit  session.  The  extraction  and  re-insertion  techniques  works 
most  of  the  time,  and  has  the  side-effect  of  nicely  adapting  dividers  to  new  window 
sizes.  Even  when  it  doesn’t  work,  no  harm  is  done,  because  the  user  can  fix  the 
header  and  proceed. 

Suppose,  for  exaimple,  that  you  want  auto-saving  to  record  the  value  of  a  string-valued 
buffer-local  variable  this-syntax,  for  type  source.  This  would  be  done  using  the  standard 
versions  of  these  functions,  as  follows: 

(defun  auto-save-draft-source  (draft-buffer  temp-buffer) 

(set-buffer  draft-buffer) 

(let  ((syntax  this-s3nitax)) 

(set-buffer  temp-buffer) 

(insert  string)) 

(standard-auto-save-draft  "source"  draft-buffer  temp-buffer)) 

(defun  revert-draft-sourceO 
(goto-char  (point -min)) 

(setq  this-syntax  (get-and-delete-line)) 

(standard-revert-draft  "source")) 

See  E~L  Etnacs  Utilities  for  get-and-delete-line. 

2.12  Examining  error  Derivatives 

You  will  not  need  to  read  this  section  unless  you  have  written  a  deriver  that  produces  a 
derivative  of  class  error.  If  you  do,  you  may  want  to  arrange  for  special  display  of  such 
derivatives,  which  is  controlled  by  the  type  of  the  artifact  to  which  they  are  attached.  As  we 
shall  see  in  section  3.5,  error  derivatives  consist  of  a  list  of  files  (perhaps  null)  and  a  string 
(perhaps  null  as  well).  If  both  the  list  and  the  string  are  null,  there  is  no  error.  The  function 
to  extend  is: 

o  examine-error-derivative- type  artifact-cell  descriptor  file-list  string 

—  This  function  is  called  with  an  empty  current  buffer,  which  is  to  be  filled  with  the 
display  of  an  error  derivative. 

—  In  most  cases,  the  artifact-cell  argument  is  ignored.  We  will  discuss  one  interesting 
use  of  this  argument  below. 

—  The  descriptor  is  a  string  (see  the  first  paragraph  of  chapter  3  for  an  explanation). 
The  descriptor  may  say  more  about  how  to  interpret  the  contents  of  the  elements 
of  file-list  or  how  to  interpret  the  string.  It  also  says  where  files  are  (see  the  next 
item) 

—  The  file-list  is  a  list  of  strings.  See  below  for  how  to  use  this  argument. 
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-  The  string  is  the  part  of  the  error  derivative  that  does  not  describe  the  list  of 
files. 

There  is  only  one  reasonable  use  for  the  file-list  argument:  call  the  following  function  with 
an  element  of  the  file-list  as  the  second  argument  (and  the  descriptor  as  the  first  argument). 

•  insert-error-derivative-contents  descriptor  file 

—  Insert  the  file  indicated  by  the  arguments  into  the  current  buffer  after  point. 

The  reason  for  the  artifact-cell  argument  is  to  support  jumping  (via  follow-link,  see 
the  E-L  User’s  Manual)  from  a  buffer  on  the  error  buffer  to  a  buffer  on  the  artifact.  To 
provide  this  kind  of  functionality,  examine-error-derivative- <ype  sets  up  a  buffer-local 
variable  that  holds  the  artifact- cell,  and  defines  the  variable  follow-link-hook  to  point  to 
a  function  which  calls: 

e  (examine-artifact  (art  if  act -cell-pointer  artifact-cell)) 

This  causes  the  buffer  on  the  artifact  to  be  the  current  buffer,  and  it  is  customary  after 
calling  the  above  function  to  set  point  close  to  the  error,  based  on  information  near  point  in 
the  buffer  on  the  error  derivative.  Later  versions  of  this  document  will  explain  how  to  use 
parts  of  the  artifact  cell  other  than  the  pointer. 
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3  Derivative  Class  Extensions 

To  integrate  a  new  tool  into  El-L,  it  is  necessary  to  understand  derivatives  that  the  tool  uses 
or  produces.  As  discussed  in  the  E-L  Users’  Manual,  a  derivative  has  a  particular  kind.  Each 
kind  belongs  to  a  derivative  class.  In  some  cases,  a  derivative  class  can  have  only  one  kind, 
namely  itself,  e.g.,  manuscript.  In  other  cases,  the  class  is  always  a  “stem”  of  the  kind, 
in  which  case  a  “X”  separates  the  class  and  a  species,  e.g.,  objectXsunS — it  does  not  make 
sense  to  have  an  object  derivative  without  saying  what  its  machine  architecture  is.  In  yet 
other  other  cases,  a  derivative  class  may  itself  be  a  kind,  and  may  also  serve  as  the  stem  of 
various  kinds. 

If  you  are  writing  a  tool  that  produces  only  derivatives  of  existing  derivative  classes: 

•  Read  only  the  first  section  below,  to  familiarize  yourself  with  the  gener£il  issues. 

•  Read  the  description  of  the  class  you  will  need  (in  whatever  document  describes  that 
particular  class) 

•  Proceed  to  the  next  chapter. 

If  you  need  to  define  a  new  derivative  class,  this  entire  chapter  is  relevant. 

3.1  Equivalence  of  Derivatives 

A  fundamental  premise  of  the  artifacts  system  is  that  a  derivative  depends  only  on  the 
contents  of  an  artifact,  the  contents  of  its  references,  and  so  on,  recursively.  An  equivalent 
formulation  is  that  a  derivative  of  an  artifact  depends  only  on  its  contents  and  on  derivatives 
of  references;  this  is  closer  to  the  way  that  derivatives  are  actually  computed,  as  we  shall 
see  in  the  next  chapter.  The  practical  import  of  this  constraint  on  derivatives  is  that  if  a 
derivative  is  deleted  and  rederived,  the  new  derivative  is  the  same  as  the  old.  This  raises  the 
question  of  what  it  means  to  be  the  “same”.  For  example,  a  derivative  of  a  given  kind  may 
be  recorded  as  a  file  in  a  certain  directory,  and  tools  manipulating  that  kind  of  derivative 
will  be  given  the  file  name  as  “the  derivative”.  In  such  a  situation,  the  meaning  of  “same” 
is  surely  not  that  the  file  name  be  identical — that  would  subvert  the  entire  purpose  of  being 
able  to  delete  the  derivative  and  to  rederive  it.  More  likely,  derivatives  of  this  kind  are  the 
same  if  the  contents  of  the  files  are  the  same,  but  this  is  only  one  instance  of  a  definition  of 
“same”. 

When  we  axe  being  completely  precise,  we  will  use  the  term  derivative  string.  As  the 
name  suggests,  this  is  a  short  ASCII  string,  such  eis  a  file  name  or  a  file  name  and  the  decimal 
representation  of  an  integer.  A  derivative  string  must  not  contain  a  newline  character.  We 
said  earlier  that  a  deriver  produces  a  derivative,  but  what  a  deriver  actually  produces  is  a 
derivative  string.  To  see  this  string,  use  a  prefix  argument  (C-u)  with  examine-derivative 
(C-z  M-x).  For  every  derivative  class,  there  must  be  a  notion  (but  not  an  implementation)  of 
equivalence  for  derivative  strings  for  that  class.  The  notion  of  a  “derivative”  is  then  abstract: 
it  is  a  set  of  strings  that  are  equivalent  according  to  the  equivalence  relation  for  the  derivative 
class  in  question.  Then  the  rule  that  a  deriver  must  produce  the  “same  derivative”  really 
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means  that  any  two  derivative  strings  that  it  produces  must  be  equivalent.  There  is  a  further 
rule  that  the  functions  that,  informally  speaking,  take  a  derivative  as  an  argument,  but  in 
reality  take  derivative  strings  as  arguments,  must  behave  the  same  on  equivalent  derivative 
strings.  There  are  two  ways  in  which  derivatives  are  used: 

•  By  functions  that  expose  the  derivative  to  a  user,  e.g.,  examine-derivative,  export- 
derivative,  as  well  as  kind-specific  commands,  e.g.  for  executing  or  printing.  In  this 
case,  what  the  user  sees  must  depend  only  upon  the  equivalence  class  of  the  derivative. 

e  Uses  of  derivatives  by  other  derivers.  In  this  case,  the  requirement  is  that  on  equivalent 
inputs,  the  deriver  produces  equivalent  outputs. 

The  second  step  is  a  recursive  condition,  reflecting  the  fact  that  derivatives  are  constructed 
recursively.  The  ultimate  condition  is  the  first  condition — the  user  cannot  tell  the  difference 
between  equivalent  derivative  strings. 

The  reason  for  this  seemingly  technical  discussion  of  the  equivalence  of  derivatives  is 
that  on  the  one  hand,  the  reproducibility  of  derivatives  is  central  to  the  simplicity  of  the 
user’s  mental  model  of  the  artifacts  system,  and  on  the  other,  the  ability  to  rapidly  obtain 
derivatives  incrementally  (i.e.,  from  derivatives  of  relatives)  is  central  to  the  usability  of 
the  system.  The  fact  that  generic  operations  on  derivatives  like  examine-derivative  and 
export -derivative  are  tailorable  by  the  extender  of  derivative  classes  allows  wide  variance 
between  the  user’s  view  of  the  derivative  and  the  data  structure  that  tools  see.  To  illustrate 
this  point,  we  shall  consider  a  derivative  class  c  in  which  the  identity  of  the  derivative  is  that 
of  a  string  of  characters,  perhaps  quite  long.  Suppose  the  derivative  is  typically  created  as 
the  contents  of  an  artifact  of  type  t,  but  where  such  an  artifact  has  references,  the  c  derivative 
of  the  reference  is  spliced  in.  This  is  not  a  toy  example;  the  c-text  derivative  of  c-source 
artifacts  works  in  exactly  this  way.  The  naive  implementation  is  simply  to  store  the  string 
of  characters  in  a  file,  and  to  represent  the  derivatives  by  a  uid  (unique  identifier),  e.g.,  a 
generated  name  of  a  file  in  a  directory  dependent  upon  the  kind.  Then  viewing  of  derivative 
with  class  c  just  puts  the  string  of  characters  in  a  buffer,  and  export  copies  the  string  to  a 
file;  the  above  definition  for  equivalence  as  “having  the  same  file  contents”  is  correct. 

There  is  a  system-wide  convention  for  bad  derivatives — they  are  represented  by  a  string 
whose  first  character  is  ASCII  2,  i.e.,  C-b,  for  “bad”.  In  documentation,  a  bad  derivative 
is  denoted,  for  example:  “BParse  errors.  A  deriver  can  generate  a  bad  derivative  as  the 
derivative  for  any  kind,  as  discussed  in  Section  4.5.1.  The  job  server  can  also  generate  bad 
derivatives,  as  discussed  in  Section  1.5.  A  deriver  never  sees  a  bad  derivative  as  input,  and 
there  is  no  need  to  define  equivalence  on  them. 

If  a  derivative  is  bad,  then  every  derivative  that  depends  on  it  will  also  be  bad.  Because 
of  this  cascading  property,  it  is  preferable  for  a  deriver  not  to  use  bad  derivatives,  but  to 
produce  a  good  derivative  if  at  edl  possible.  The  standard  procedure  for  indicating  problems 
with  derivation  by  means  of  derivatives  is  discussed  in  detail  in  Section  3.5. 

3.2  The  derivative-classes  Plex 

The  structure  of  the  derivative-classes  plex  is  very  similar  to  that  of  the  types  plex 
(chapter  2);  the  only  difference  is  that  each  line  has  an  additional  field  that  describes  the 
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representation  of  derivatives  of  that  class.  Just  to  make  it  official,  a  line  in  this  plex  has  the 
form: 

•  derivative-class  resident-file  auxiliary-file  representation 

In  a  view  on  the  plex,  the  last  three  fields  are  laid  out  vertically,  and  the  representation  field 
may  extend  over  several  lines.  This  latter  field  will  be  discussed  in  detail  in  section  3.3. 

The  commands  to  modify  the  derivative-classes  plex  are  described  only  by  their  head¬ 
ers;  for  a  full  explanation,  modulo  the  representation  field,  see  the  corresponding  command 
for  the  types  plex  in  Chapter  2. 

•  nev-derivative-class 

derivative-class  resident-file  auxiliary-file  representation  ^optional  initialization-file 

—  Obtaining  the  representation  argument  interactively  sets  up  a  recursive  edit  of 
the  plex  view,  with  the  new  entry  for  the  derivative  class  and  the  cursor  in  the 
representation  field.  Highlighting  sets  off  the  editable  region,  which  is  terminated 
by  a  row  of  dashes.  (Type  C-1  if  the  highlighting  seems  wrong.)  Conclude  the 
recursive  edit  with  C-M-c,  or  abort  the  edit  and  the  command  with  C-M~] . 

•  repair-derivatives-class 

derivative-class  resident-file  auxiliary-file  representation  k  optional  repair-file 

—  Obtaining  representation  argument  interactively  sets  up  a  recursive  edit  of  its 
previous  value.  (The  field  is  terminated  by  dashes  only  if  it  is  the  last  derivative 
claiss  in  the  buffer.)  Terminate  the  edit  as  for  the  previous  command. 

•  delete-derivative-class  derivative-class 
Also  analogous  to  types  are  the  following: 

•  autoload-derivative-class  symbol  derivative- class  string  boolean 

•  require-derivative-class  derivative-class 

There  is  now  only  one  Emacs  Lisp  function  family  that  needs  extension  by  derivative- 
class: 

o  examine-derivative- den'rafive-c/ass  derivative-string 

—  This  function  is  called  only  when  the  derivative  is  not  totally  bad,  with  the  argu¬ 
ment  being  the  good  part  of  the  derivative. 

—  Called  with  with  the  current  buffer  to  be  filled  with  a  display  of  the  given 
derivative-string.  The  only  data  in  the  buffer  is  the  bad  part  of  the  derivative,  if 
any. 

—  stamdard-examine-derivative  checks  whether  the  derivative  class  structure  is 
a  simple  file;  if  so,  it  inserts  the  file,  and  if  not,  it  simply  inserts  the  string. 
[This  should  be  generalized  to  use  the  representation  field  of  the  plex  to  parse  the 
derivative-string,  and  divide  the  buffer  into  sections  showing  all  files,  directories, 
and  artifacts.] 
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3.3  Representation  of  Derivatives 

In  this  section,  we  describe  the  syntajc  and  semantics  of  the  representation  field  of  an  entry 
in  the  derivative-classes  plex.  The  rationale  behind  this  field  is  to  allow  the  extender  to 
give  a  structural  description  of  a  derivative  which  will  allow  the  artifacts  system  to  take  care 
of  adl  storage  management  of  derivatives.  For  example,  consider  the  simple  case  in  which  a 
derivative  string  for  a  kind  is  simply  a  file  name.  In  this  case,  the  representation  field  will 
be  the  letter  f ,  signifying  file.  As  a  string,  the  file  name  consists  of  a  sequence  of  digits, 
interpreted  as  a  file  name  relative  to  the  canonical  directory  for  derivatives  of  that  kind, 
henceforth  denoted  kind/.  Since  there  are  many  situations  in  which  there  is  significant  space 
savings  if  derivatives  can  share  data,  it  is  possible  for  the  same  file  to  appear  in  different 
derivatives.  The  question  then  becomes,  when  a  derivative  is  deleted,  should  the  file  be 
deleted?  Clearly,  the  answer  is  yes  if  and  only  if  there  are  no  more  derivatives  that  refer 
to  this  file.  To  do  the  bookkeeping  here,  when  a  deriver  reports  a  derivative  of  this  kind 
for  an  artifact  p,  the  system  knows,  because  the  representation  field  is  f ,  that  there  is  one 
more  (maybe  the  first)  reference  to  that  file  from  a  derivative,  and  places  an  inhibition  on 
that  file  (i.e.,  an  inhibition  against  deletion)  on  behalf  of  the  kind  derivative  of  p.  When  a 
derivative  of  an  artifact  is  deleted,  the  system  similarly  knows  to  remove  the  inhibition  of 
the  file  that  was  added  by  the  derivative.  When  the  last  inhibition  of  a  file  disappears,  the 
file  itself  is  deleted.  In  short,  the  system  does  all  the  storage  management  bookkeeping;  all 
the  derivative  class  extender  must  say  is  “the  representation  of  this  derivative  class  is  f”. 

A  related  use  to  which  the  representation  field  is  put  is  “repository  checking”.  Because 
the  representation  field  describes  how  data  is  laid  out,  it  is  possible  to  periodically  scan 
the  repository,  checking,  for  example,  that  all  files  that  should  exist  do  exist,  and  that  all 
existing  files  are  referenced  by  some  derivative. 

A  secondary  use  of  the  representation  field  is  in  providing  stamdard  behavior  for  exanine- 
derivative.  In  most  cases,  the  representation  field  provides  enough  information  to  display 
a  derivative-string  in  a  way  that  depends  only  upon  its  equivalence  class.  For  example,  when 
the  representation  field  is  f ,  the  display  of  it  to  a  user  is  a  buffer  filled  with  the  contents  of 
the  file.  We  could  also  use  the  representation  to  provide  a  standard  behavior  for  equivalence 
of  derivative  strings,  but  as  of  now,  there  is  no  reason  to  actually  implement  such  a  function. 

If  all  derivatives  were  represented  as  files,  there  would  be  no  need  for  a  representation 
field.  However,  the  representation  of  derivatives  can  be  quite  complicated,  in  part  because 
of  the  nature  of  existing  tools,  and  in  part  because  the  design  of  derivers  that  are  efficient  in 
incremental  situations  often  requires  relatively  complicated  data  structures  for  derivatives. 
The  representation  field  can  describe  such  data  structures,  and  as  a  consequence,  may  itself 
be  somewhat  complicated.  But  as  we  have  seen  with  the  f  example,  if  the  derivatives  are 
simple,  so  is  the  representation  field  that  describes  them. 

The  remainder  of  this  section  provides  the  specifics  of  what  can  currently  appear  in  the 
representation  field. ^  We  will  start  with  the  simplest  and  most  commonly  used  cases. 

is  our  plan  that  when  a  new  objectbase  becomes  available,  the  representation  field  will  need  to  be 
extended  to  accommodate  descriptions  of  derivatives  that  are  stored  in  that  object  base.  One  might  say  that 
the  only  objectbase  now  supported  is  the  Unix  file  system. 
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•  f — a  uid,  relative  to  kind/.  The  uid  names  the  file. 

—  Syntactically,  a  uid  appears  as  a  string  of  digits. 

—  If  a  new  derivative  refers  to  a  file,  an  “inhibition”  on  that  file  is  added  (i.e.,  an 
inhibition  against  deletion). 

—  When  a  derivative  is  deleted,  its  inhibition  is  deleted;  when  the  last  inhibition  is 
deleted,  the  file  is  deleted. 

—  [Currently,  inhibitions  are  not  created  and  removed  as  derivatives  come  auid  go. 
Instead,  the  derivative  representation  is  used  by  a  mark  and  sweep  garbage  col¬ 
lection  algorithm  that  is  run  as  part  of  the  daily  repository  check.] 

—  standard-exaffline-derivative  displays  the  contents  of  the  file  neimed  by  the 
uid. 

•  f  {derivative-class} — same  as  f ,  but  the  file  is  stored  relative  to  the  canonical  directory 
for  derivatives  of  the  given  derivative-class. 

•  — sames  as  f,  but  the  derivative  string  has  species/ uid  rather  than  just  uid.  The 
species  does  not  include  the  X;  in  particular,  /uid  indicates  that  the  kind  is  exactly  the 
derivative  class. 

•  f  .extension. — same  as  f,  except  that  the  name  of  the  file  in  kind/  has  .extension 
appended  to  the  uid.  The  extension  does  not  appear  in  the  derivative  string.  Extensions 
are  provided  only  for  compatibility  with  existing  UNIX  tools.  They  should  not  be  used 
unless  necessary. 

•  f<representation> — same  as  f ,  but  in  addition  the  file  has  contents  structured  according 
to  the  representation,  recursively.  (This  is  necessary  if  the  file  contains  embedded 
references  to  data  that  must  be  protected  against  garbage  collection.) 

•  f  {derivative-class}*/, .  extension . ,  f{derivative-class}y,,  f  {derivative-class} .  extension. , 
f%.  extension.,  plus  all  of  these  with  <representation>  tacked  on  to  the  end — the  obvi¬ 
ous  combinations  of  the  previous  four. 

•  d — like  f ,  but  the  entry  in  kind/  is  a  directory,  not  a  file. 

—  Inhibitions  are  handled  in  the  same  way;  when  the  Izist  one  goes,  the  contents  of 
the  directory  are  deleted,  and  then  the  directory  itself. 

—  standard-examine-derivative  lists  the  file  names  in  the  directory,  and  provides 
for  the  user  to  be  able  to  ask  to  see  the  contents  of  any  requested  file.  (The 
assumption  here  is  that  equivalence  of  directories  means  that  they  have  same- 
named  files  within  them  and  that  corresponding  files  have  the  same  contents.) 

There  is  no  provision  for  directories  with  extensions,  since  no  tool  requires  these. 
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•  b{<ype} — like  f ,  but  the  file  is  stored  in  the  directory  for  bodies  of  a  given  artifact 
type.  The  file  may  also  have  inhibitions  on  behalf  of  the  various  artifacts  whose  bodies 
refer  to  it. 

•  bffype}.  extension. — like  b,  but  the  file  has  the  indicated  extension. 

•  a — an  artifact. 

This  concludes  the  description  of  the  constructs  that  govern  storage  behavior  of  derivatives. 
The  rest  of  the  possibilities  for  a  representation  provide  the  ability  to  describe  derivative 
strings  with  structure  more  complex  than  the  above  possibilities.  First,  there  is  the  need  to 
specify  punctuation  between  pieces: 

•  '  character — The  character  appears  literally  in  the  derivative  string.  A  character  is  not 
quite  a  single  ASCII  character,  but  rather: 

—  A  single  character  other  than  \  stands  for  itself. 

—  \digits  stands  for  the  character  whose  code  is  given  by  the  digits,  in  octal. 

—  \\  stands  for  the  character  \. 

•  Zcharacter  , . .] — The  character  set  construct,  taiken  directly  from  Emacs  regular  ex¬ 
pressions,  except  that  \  obeys  the  above  quotation  convention. 

•  Except  when  preceded  by  a  '  or  inside  a  character  set,  all  white  space  in  a  represen¬ 
tation  is  ignored. 

The  representation  [\l-\377]  means  that  the  derivative  consists  of  a  single  non-null  char¬ 
acter  (the  possibility  that  the  derivative  consists  of  a  newline  is  excluded  by  other  means). 

The  remaining  ways  of  constructing  a  representation  all  involve  recursive  use  of  repre¬ 
sentation.  This  includes  at  least  the  power  of  regular  expressions.  Precedence  of  the  various 
constructs  will  be  summarized  at  the  end  of  the  section. 

•  representation  representation — concatenation.  For  example,  the  representation  f  ’  f 
says  that  the  derivative  string  consists  of  a  space-separated  pair  of  files. 

•  representation\  representation — alternatives.  For  example,  'Ffl'Dd  specifies  that  a 
derivative  string  is  either  F  followed  by  the  uid  of  file,  or  D  followed  by  the  uid  of  a 
directory.  NB:  Due  to  technical  limitations  in  the  regular  expression  machinery,  if  one 
alternative  can  match  a  substring  of  another  alternative,  the  shorter  alternative  must 
appear  last.  In  particular,  an  empty  string  is  a  legal  alternative,  but  it  must  be  the 
last  alternative. 

•  representation* — 0  or  more  repetitions.  For  example,  [\41-\177]*  specifies  a  deriva¬ 
tive  string  consisting  of  printing  ASCII  characters  with  no  indirect  storage. 

•  representation*<representation> — 0  or  more  repetitions  of  the  first  representation,  sep¬ 
arated  by  the  second  representation.  For  example,  f+<'  >  specifies  a  possibly  null 
space-separated  list  of  files. 
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•  representation* — 1  or  more  repetitions. 

•  representation*<representation> — 1  or  more  repetitions  of  the  first  representation,  sep¬ 
arated  by  the  second  representation. 

•  (representation) — grouping. 

Finedly,  there  are  situations  in  which  regular  expressions  are  not  adequate,  because  the 
derivative  is  tree-structured.  For  this,  there  is  a  way  of  introducing  a  nomenclature  for  a 
representation  and  then  referring  to  a  representation  from  within  it. 

•  '.name '.representation — the  name  may  consist  of  any  characters  oth<,;  than  :  and  *. 
There  is  no  name  hiding;  the  introduced  name  should  be  unique  v^  uhin  the  entire 
representation  for  a  class. 

•  •name^ — same  form  as  name  above.  This  may  occur  only  within  a  representation 
named  by  -.name:. 

As  an  example,  suppose  the  derivative  is  a  list  of  elements  each  of  which  is  either  an  integer 
or  a  parenthesized  list  of  the  same  possibilities,  recursively.  Then  the  representation  is: 

•  :tree:  ([0“9]  +  | '  (■tree*' '))+<'  > 

Strings  matched  by  this  representation:  “1  2  3”  and  “4  (S  6)  7”.  It  should  be  noted  that 
this  is  only  an  example,  and  is  not  particularly  useful,  because  it  does  not  specify  pointers 
that  lead  to  other  data.  A  more  realistic  example  uses  the  <representation>  option  for  the 
f  syntax.  Suppose  that  rather  than  parenthesized  lists,  the  tree  structuring  is  done  by  files. 
Because  a  file  is  syntactically  the  same  <is  a  sequence  of  digits,  the  syntax  specifies  that  files 
are  preceded  with  “®”: 

•  :tree: ([0-9]+| '®f<=tree*>)+<’  > 

[In  the  initial  implementation,  it  is  in  fact  a  requirement  that  -name-  be  inside  a  representa¬ 
tion  associated  with  an  s  which  is  in  turn  inside  the  representation  associated  with  :  name: . 
This  restriction  guarantees  that  any  particular  string  is  parsable  by  a  regular  expression.] 
The  promised  table  of  precedences,  from  highest  to  lowest: 

*  and  + 
juxtaposition 

I 

:  name: 

Precedence  is  of  course  overridden  with  (  )  and  <  >. 
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3.4  Fancy  Techniques  for  Representing  Derivatives 

Recall  the  derivative  class  c  from  section  3.1.  Suppose  we  notice  that  derivatives  of  class  c 
tend  to  be  large  and  that  significant  time  is  spent  in  re-deriving  them.  We  might  choose  an 
alternate  representation,  for  example; 

•  Either  a  derivative  is  a  hie, 

•  or  else  it  is  two  files,  one  of  which  is  a  base  file,  and  the  other  of  which  is  a  file  of 
“deltas”. 

(The  representation  field  would  be  “f  (  f  I)”.)  Just  to  be  specific,  let  us  suppose  that  a  file 
of  deltas  is  a  sequence  of  entries  each  having  the  format; 

•  — after  the  nth  character  in  the  base  file,  replace  the  next  m  characters 
by  string,  which  has  p  characters. 

Assume  such  a  file  is  sorted  by  n.  It  is  a  simple  matter  to  create  the  string  from  a  baise 
file  and  a  delta  file,  so  that  exaraine-derivative  produces  the  same  display  as  a  naive 
representation.  Similarly,  it  is  possible  to  write  a  filter  program  to  produce  the  stream  of 
characters,  which  can  then  be  fed  to  tools,  like  cc,  that  take  the  derivative  as  input.  With 
not  too  much  more  difficulty,  one  can  write  a  program  to  combine  two  deltas  into  a  single 
delta,  which  may  be  useful  in  certain  cases  of  incremental  derivation.  Note  that  deletion 
may  need  to  change  a  delta  representation  to  a  direct  representation. 

Whether  such  a  sophisticated  implementation  of  derivatives  would  pay  off  is  not  really 
the  issue.  That  has  to  be  decided  by  measurements  under  typical  patterns  of  use.  The  point 
is  that  there  is  sufficient  flexibility  in  the  system  to  allow  such  engineering  to  be  done  and 
yet  hide  it  from  the  user. 

A  delta  representation  is  one  technique  in  managing  derivatives.  A  second  technique  is 
“operator  derivatives”,  where  the  idea  is  to  package  the  information  in  a  way  that  maximizes 
the  amount  of  information  that  can  be  used  incrementally.  This  can  be  done  by  treating  the 
derivative  as  an  “application”  of  an  “operator”  to  some  “operands”.  The  operator  comes 
from  the  contents  of  the  artifacts  and  possibly  some  recisonably  stable  part  of  the  derivatives 
of  references.  It  has  “formal  parameters”  in  it.  The  operands  come  from  the  derivatives 
of  the  references.  If  the  contents  of  the  successor  do  not  change,  and  if  the  parts  of  the 
references’  derivatives  used  in  forming  the  operator  also  do  not  change,  a  new  artifact  may 
share  the  operator  part  of  its  predecessor’s  derivative,  even  if  the  operands  (the  references’ 
derivatives)  have  changed.  This  approach  makes  sense  when  references  change  more  often 
than  the  contents  of  artifacts  that  reference  them. 

Operator  representations  of  derivatives  are  useful  when  there  is  an  existing  Unix  tool 
which  follows  embedded  file  names,  and  when  the  definition  of  derivative  equivalence  is 
intended  to  cater  to  this  tool.  (For  example,  the  C  preprocessor,  when  it  encounters  an 
tinclude  macro,  treats  its  argument  as  a  file  name.)  The  “formal  parameters”  are  the  file 
names,  and  the  “application”  is  defined  as  running  the  tool  in  a  context  in  which  the  file 
names  are  linked  to  appropriate  parts  of  the  derivative.  In  this  case,  a  derivative  may  be 
represented  as  follows; 
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—  /  is  a  uid  relative  to  the  directory  kind/. 

—  Each  Ui  is  a  name  by  which  a  file  is  referenced  from  within  /  by  the  Unix  tool. 
(In  the  C  preprocessor  example,  the  derivative  would  contain  #include  "ui".)  It 
must  be  guaranteed  that  the  u,  associated  with  different  files  /  are  distinct. 

It  might  seem  that,  because  the  u,  are  nearly  arbitrary  choices,  in  some  sense  the  derivative  is 
not  determined  entirely  by  the  artifact,  but  of  course,  we  have  a  subtle  definition  of  derivative 
equivalence  in  mind,  specifically,  let  /QujQ  •  •  •  \^k  and  /'Qu'iQ  •  •  •  strings 

representing  derivatives.  They  are  equivalent  when  k  =  k'  and  when  they  pass  the  following 
test: 

•  Take  any  directory  d  in  which  the  names  Ui  and  u[  are  not  used,  and  take  any  absolute 
path  names  /i, . . . ,  Link  d/ fo  to  kind/ f\  link  d/u,  to  /,,  i  =  1, . . . ,  k\  and  run  the 
Unix  tool  in  directory  d.  Alternatively,  link  d/fo  to  kind/ f  and  d/uj  to  /,,  and  run 
the  tool.  There  will  be  no  difference  in  the  effects  of  the  two  tool  invocations. 

Loosely  speaking,  /  and  /'  have  the  same  contents,  except  that  where  /  refers  to  a  file  u„ 
/'  refers  to  u\.  It  turns  out  that  it  is  not  necessary  to  literally  implement  this  definition 
of  equivalence;  the  main  point  here  has  been  to  give  an  example  of  derivative  equivalence 
which  is  tailored  to  the  semantics  of  an  existing  tool.  It  is  also  necessary  to  define  viewing 
and  exporting  of  the  derivative  appropriately;  it  would  not  be  kosher  to  expose  the  u,  or  u\. 
How  to  do  this  is  a  second-order  detail  which  we  do  not  discuss  further  here. 

A  downside  to  cleverly  engineered  representations  of  derivatives  is  that  they  become  more 
complex  to  use,  and  that  all  the  tools  that  use  the  derivatives  of  a  particular  class  must  deal 
with  that  complexity.  However,  our  experience  has  shown  that  derivatives  of  a  given  class 
are  ultimately  consumed  by  only  one  tool.  For  example,  c- source  derivatives  may  be  spliced 
into  other  c-source  derivatives,  but  are  eventually  processed  to  produce  c-f  ile  and  h-f  ile 
derivatives.^ 

3.5  error  Derivatives 

Many  derivers  detect  errors  in  their  inputs,  for  example,  a  compiler  will  detect  syntax  er¬ 
rors.  Almost  all  such  errors  should  be  reported  as  derivatives  with  class  error,  for  reasons 
discussed  in  the  last  paragraph  of  section  3.1.  As  we  noted  in  section  2.12,  such  derivatives 
logically  consist  of  a  list  of  files  and  a  string.  The  encodement  into  a  derivative  string  is  as 
follows: 

•  Zero  or  more  uids  relative  to  the  kind  in  question,  separated  by  spaces. 

•  Either  nothing  else,  or  a  vertical  bar  ( I )  followed  by  arbitrary  text  (not  containing  a 
null,  “A,  or  newline). 

A  deriver  produces  such  a  derivative  in  the  same  way  that  it  would  produce  any  other 
derivative. 

current  exception  to  this  is  manuscript  derivatives,  which  are  used  not  only  in  typesetting  but  also 
in  extracting  program  text.  This  is  a  temporary  aberration  resulting  from  historical  use  of  files. 
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3.6  universal  Artifacts 

The  artifact  type  universal  allows  the  specification  of  any  desired  derivative  (well,  almost). 
The  contents  of  such  an  artifact  is  a  series  of  lines  having  one  of  two  forms: 

•  comment — empty  or  beginning  with  a  non-alphabetic. 

•  kind^derivative-ittring 

The  universality  comes  from  the  fact  that  a  universal  artifact  has  a  derivative  of  kind  k 
if  and  only  if  k  is  among  the  kinds  listed  in  its  contents,  and  the  derivative  is  given  by  a 
derivative  string  associated  with  k,  with  suitable  provisions  for  files,  directories,  and  artifacts, 
discussed  below. 

First,  a  caveat.  To  construct  a  universal  artifact,  you  must  understand  the  precise  form 
expected  of  the  derivative  string.  The  system  will  help  you  as  far  as  it  can — it  will  use  the 
representation  field  of  the  derivative-classes  plex  to  ensure  that  each  derivative  string 
has  the  right  format — but  there  may  be  subtle  properties  that  are  required  of  a  derivative  of  a 
given  kind  that  are  not  grammatically  specifiable:  univei  :»al  artifacts  are  for  professionals; 
do  not  attempt  to  use  them  at  home. 

Now  we  describe  the  provisions  for  the  parts  of  a  derivative  string  that  “point”  to  data. 
The  simplest  of  these  is  an  artifact  pointer,  indicated  in  a  derivative  representation  by  a.  In 
a  universal  draft,  simply  insert  an  artifact  reference  at  the  point  in  the  derivative  string 
where  you  want  the  artifact  pointer.  In  the  draft,  or  in  the  universal  artifact  obtained  by 
committing,  this  will  look  like  an  ordinary  artifact  reference.  But  in  the  derivative  string,  it 
will  appear  as  the  desired  artifact  pointer. 

A  derivative  representation  indicates  a  file  by  f  [{dent;a<tre-c/ass}][%][.  extension .]  or  by 
b{<ype}[.  extension.].  Of  these  possibilities,  only  f[.  extension.)  can  be  produced  by  the 
deriver  for  universal  artifacts.  There  are  two  ways  of  specifying  the  derivative  string  for 
this  case,  allowing  you  to  choose  between  data  integrity  and  reduced  disk  space.  The  reason 
for  allowing  you  to  sacrifice  data  integrity  is  that  there  are  files  outside  the  artifacts  system 
that  one  cannot  afford  to  copy  into  the  artifacts  system,  for  example,  compiler  executables  or 
a  Lisp  world.  (Not  to  mention  the  fact  that  some  compilers  must  reside  at  a  particular  place 
in  the  file  system.)  While  it  is  a  practical  necessity  to  allow  pointers  out  to  the  mutable  world, 
it  is  also  desirable  to  retain  the  immutability  properties  of  artifacts  and  derivatives.  One 
way  to  compromise  between  these  irresoluble  differences  is  to  specify  a  file  in  a  universal 
draft  by  an  absolute  path  (i.e.,  beginning  with  a  /),  that  you  highlight  (use  C-z  C-h)  to 
distinguish  it  from  other  text  in  the  derivative  string.  Highlighting  this  text  first  checks  that 
the  file  exists,  and  does  not  succeed  if  there  is  no  file.  If  the  highlight  is  successful,  you  will 
see  that  the  path  is  now  followed  by  two  numbers,  the  length  in  bytes  and  a  check  sum, 
together  called  a  certificate.  (Computing  the  check  sum  may  take  a  while  on  large  files.) 

One  of  the  uses  of  a  certificate  is  in  derivation.  We  now  explain  the  “suitable  provision” 
for  a  file  reference  that  is  an  absolute  path.  The  deriver  checks  to  see  that  the  indicated  file 
still  exists,  and  if  so,  whether  it  still  has  the  same  certificate.  If  so,  it  allocates  a  new  uid  for 
the  kind  in  question,  and  establishes  a  soft  link  from  this  uid  (possibly  with  an  extension) 
to  the  absolute  path.  If  not,  it  produces  a  bad  derivative. 
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Because  the  derivative  “file”  is  a  link,  it  is  subject  to  deletion  or  change  not  under  control 
of  the  artifacts  system;  this  is  the  price  paid  for  not  duplicating  the  data.  However,  to  provide 
a  modicum  of  integrity,  the  repository  checker  periodically  runs  the  following  algorithm: 

•  For  every  universal  artifact: 

—  For  every  absolute  path  in  a  derivative  string: 

*  If  the  derivative  exists: 

•  Let  the  uid  for  the  file  be  u. 

•  If  the  file  u  (i.e.,  the  file  to  which  it  is  linked)  has  a  certificate  different 
from  that  indicated  in  the  argument,  post  a  notice  to  the  creator  of  every 
artifact  in  the  transitive  closure  of  references  to  the  artifact  in  question, 
unless  there  is  already  such  a  notice. 

The  system  does  not  delete  any  suspect  derivatives,  because  the  file  in  question  may  have 
been  only  temporarily  corrupted.  However,  while  the  system  does  not  pile  up  redundant 
notices  for  any  user,  the  notice  will  keep  coming  back  until  the  problem  is  fixed.  Thus,  even 
though  the  integrity  of  the  derivatives  is  not  guaranteed,  violations  of  that  integrity  will, 
with  high  probability,  be  reported  to  an  interested  party  within  a  finite  amount  of  time,  and 
will  effectively  persist  until  the  problem  is  cleared. 

Another  way  to  specify  a  file  part  of  a  derivative  string  is  to  use  an  “internal  pointer”.  In 
this  case,  the  integrity  of  derivatives  is  guaranteed,  and  the  file  is  stored  within  the  artifacts 
system.  To  use  this,  highlight  the  character  C,  which  will  turn  into  (On  for  some  n,  and  a 

divider  of  the  form - File  n - -  •  •  in  inverse  video)  will  appear  at  the  end  of  the  buffer. 

Insert  the  desired  contents  after  the  divider.  If  there  are  several  internal  pointers  to  files, 
there  will  be  several  dividers,  effectively  dividing  the  buffer  up  into  “internal  files”.  When  the 
draft  is  committed,  the  data  for  each  internal  file  will  be  stored  in  a  real  file,  and  derivation 
will  establish  links  to  these  files,  much  the  same  as  for  an  absolute  path,  except  that  because 
the  file  is  under  control  of  the  artifacts  system,  there  is  no  danger  of  its  becoming  invalid. 
[It  may  be  necessary  to  provide  a  mechanism  for  giving  execute  permission  to  real  files.]  If 
you  edit  an  artifact,  you  can  then  edit  the  internal  files.  The  new  artifact  will  share  files 
with  a  predecessor  if  possible,  but  will  create  new  files  as  necessary. 

Similar  provisions  could  be  provided  for  directories,  represented  by  d  in  a  derivative 
representation.  However,  the  need  for  these  is  judged  to  be  low  priority,  and  the  details  of 
the  user  documentation,  not  to  mention  the  implementation,  will  be  postponed  until  there 
is  a  felt  need. 
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4  Derivers 

The  subject  here  is  how  to  write  derivers  and  install  them.  This  chapter  assumes  that  the 
reader  has  used  the  E-L  system,  and  so  is  familiar  with  jobs  being  run  on  the  user’s  behalf. 
It  also  assumes  that  the  reader  already  understands  the  representation  of  artifacts  whose 
type  is  that  of  interest  and  the  details  of  their  representation  for  the  desired  derivative  class. 

4.1  How  Jobs  are  Scheduled 

The  purpose  of  this  section  is  to  provide  an  extender  with  an  accurate  mental  model  of  how 
jobs  axe  run,  and  thereby  provide  a  framework  in  which  the  specifics  of  remaining  sections 
make  sense. 

Scheduling  jobs  is  a  distributed  process;  there  is  no  central  scheduling  agent.  An  artifact 
typically  has  some  jobs  scheduled  for  it  when  it  is  committed,  and  the  user  can  explicitly 
schedule  jobs  using  derive-from.  But  requirements  for  additional  derivatives  also  arise 
during  derivation,  and  these  requirements  may  entail  further  scheduling.  If  so,  exactly  the 
same  algorithm  and  the  same  shared  data  structures  are  used  as  in  the  case  of  scheduling, 
either  implicitly  or  explicitly,  that  originates  in  an  editor  session. 

The  scheduling  problem  is  this:  given  an  artifact  p  of  type  t,  and  a  kind  of  k  for  a  desired 
derivative  of  p,  how  does  the  system  ensure  that  the  derivative  will  eventually  be  computed? 
There  are  several  plexes  that  are  involved  in  scheduling  jobs.  To  simplify  the  description  of 
how  jobs  are  scheduled,  this  section  treats  all  of  them  as  conceptually  being  relations,  and 
ignores  the  parts  of  the  plexes  not  having  to  do  with  scheduling. 

•  present — each  tuple  in  this  relation  corresponds  to  a  job  server,  i.e.,  a  process  capable 
of  running  jobs.  The  fields  are: 

—  id — unique  to  a  job  server;  it  can  be  used  to  communicate  with  it. 

—  job-server-class — a  string  that  says  what  kinds  of  jobs  this  job  server  is  capable 
of  performing. 

—  idle — a  boolean,  indicating  whether  this  job  server  can  be  sent  a  new  job. 

—  current-jobs — a  set  of  pairs  of  the  form  (p,,  Ki),  where  pi  is  an  artifact  pointer  and 
Ki  is  a  set  of  kinds,  indicating  which  derivatives  of  which  artifacts  are  presently 
being  computed  by  this  job  server. 

•  future — each  tuple  in  this  relation  is  a  job  that  is  waiting  to  be  run.  The  fields  are: 

—  artifact 

—  kind 

—  priority — low  numbers  win. 

—  timestamp — when  the  entry  was  made;  used  in  tie-breaking  in  case  of  identical 
priorities. 
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Important  invariant:  if  (p,  A:,  •••)  is  in  the  future  plex,  then  there  is  no  (p,  A')  with 
ib  €  A  in  a  current-jobs  field  of  any  entry  in  the  present  plex. 

e  workers — this  relation  has  the  fields: 

—  worker — a  string  with  an  associated  executable. 

—  tool — the  name  by  which  the  outside  world  refers  to  a  function  within  the  ex¬ 
ecutable.  (For  derivers,  the  name  of  the  tool  is  generally  of  the  form  derive- 
type/kind.  We  will  see  later  that  there  are  tools  other  than  derivers.) 

—  job-server-class — says  which  job  servers  (elements  of  the  present  plex)  are  al¬ 
lowed  to  run  this  tool. 

A  given  worker  may  appeeu'  in  many  tuples. 

•  derivers — fields: 

—  tool — the  name  of  the  tool  capable  of  computing  a  set  of  derivatives  from  an 
artifact,  or  else  deciding  that  such  a  derivative  definitely  does  not  exist. 

—  type — the  type  of  some  artifact. 

—  kind-set — the  set  of  kinds  that  will  be  produced  by  the  deriver. 

The  derivers  plex  does  not  say  that  the  derivative  of  a  certain  kind  necessarily  exists; 
it  may  be  necessary  for  the  tool  to  take  a  closer  look  to  decide  that. 

The  reader  is  warned  that  “conceptual”  is  the  operative  word  for  this  relation — if 
represented  literally  as  a  set  of  tuples,  it  would  be  infinite;  moreover  the  kind-set  in 
each  tuple  might  be  infinite.  The  finesse  will  be  explained  in  section  4.4. 

We  now  answer  the  scheduling  problem  quite  succinctly: 

•  Given  an  artifact  p  and  kind  k,  is  there  some  (p,  K)  with  k  £  K  in  the  current-jobs  field 
of  some  entry  in  the  present  plex?  If  so,  the  job  is  already  being  run.  (In  practice, 
the  job  server  can  be  told  to  notify  the  new  requester  when  the  job  is  done.) 

•  If  the  job  is  not  under  way,  let  t  be  the  type  of  p,  and  ask  whether  there  exists  a  tool 
T,  a  kind  set  K,  a  worker  w,  a  job-server-class  c,  and  a  job  server  id  j  such  that: 

—  {T,t,K)  €  derivers  and  k  €  K 

—  {w,T,c)  €  workers 

—  (7,c,true,  •  •  •)  €  present  (i.e.  an  idle  job  server) 

If  so,  send  j  a  message  telling  it  to  get  busy  deriving  kind  k  from  artifact  p  using  tool 
T  in  worker  w.  Moreover,  add  {p-,K)  to  the  current-jobs  field  of  j. 

•  If  there  do  not  exist  T,  K,  w,  and  c  such  that  the  first  two  conditions  hold,  then  the 
job  is  not  schedulable. 
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•  Otherwise,  put  p  and  k  into  the  future  plex  (with  suitable  priority  and  timestamp). 

There  is  one  other  situation  in  which  scheduling  is  done,  namely  when  a  job  server  is  about 
to  become  idle.  Before  setting  its  idle  field  to  true,  a  job  server  with  class  c  «isks  if  there 
exists  an  artifact  p  of  type  t,  kind  k,  kind-set  K,  tool  T,  and  worker  w  such  that: 

•  (p»^>  •  •  •)  €  future 

•  (T,  t,  K)  €  derivers  and  k  £  K 

•  {w,T,c)  €  workers 

If  so,  it  picks  the  p,k  with  the  smallest  priority/time  in  the  future  plex,  and  removes  it 
along  with  any  other  p,k‘  where  k'  €  K.  This  maintains  the  invariant  relating  the  present 
and  future  plexes.  Only  if  no  such  p,  k,T,  K,w  exists  does  the  job  server  become  idle. 

As  stated  earlier,  a  tool  in  the  above  relations  must  be  connected  with  a  function  in¬ 
side  a  worker.  However,  a  worker  and  a  job-server-class  have  no  meaning  other  than  their 
use  as  strings  in  making  connections  in  these  relations.  In  other  words,  a  worker  can  be 
renamed  at  will  in  the  workers  plex,  without  effect  on  scheduling.  (It  can’t  be  changed  to 
an  existing  worker  though).  Similarly,  changing  a  job-server-class  uniformly  throughout  the 
workers  and  present  plexes  to  a  previously  unused  job  server  class  will  not  affect  scheduling 
subsequent  jobs. 

Observe  that  the  present  and  future  plexes  are  modified  as  jobs  are  scheduled,  but 
the  workers  and  derivers  plexes  are  not.  Rather,  they  are  modified  by  the  extender  to 
direct  jobs  to  the  right  tool  in  the  right  worker  for  the  right  job  server.  Section  4.3  and 
4.4  discuss  manipulating  these  plexes,  and  the  subsequent  sections  describe  how  to  write 
a  deriver.  Observe  that  the  order  in  which  you  actually  do  an  extension  is  the  reverse  of 
the  order  of  the  presentation — first  you  have  to  write  the  derivers,  then  put  the  tool  in  the 
derivers  plex,  and  then  install  the  worker. 

4.2  The  hosts  plex 

(A  slight  detour  before  details  of  deriver  installation).  The  hosts  plex  is  not  used  during 
scheduling,  but  it  is  relevant  here,  because  it  is  used  by  job  servers.  A  line  in  the  plex: 

•  host  machine-type  default-job-server-class-variant 

—  The  machine-type  is  not  at  the  discretion  of  the  extender,  but  is  computed  from 
the  host  and  cached  in  the  plex.  See  Porting  E-L  for  a  full  discussion. 

—  The  usual  start-up  of  a  job-server  on  the  host,  e.g.,  with  M-x  start- job-server, 
concatenates  the  machine-type  and  default-job-server-class-variant  to  obtain  the 
job-server-class  for  the  job-server’s  entry  in  the  present  plex. 

It  is  also  possible  to  start  a  job  server  with  a  given  value  for  its  job-server-class,  in  which 
case  it  ignores  the  hosts  plex.  [PRESENT  REALITY:  there’s  no  job-server-class  apart  from 
the  machine-type  of  the  host.] 

The  hosts  plex  is  manipulated  with  the  following  commands: 
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•  nev-host  host  default-job-server-class-variant 

—  The  host  must  not  already  be  in  the  plex. 

—  If  the  host  is  not  the  one  on  which  the  command  is  issued,  this  command  will  do 
a  remote-shell  to  the  host  in  order  to  compute  the  machine-type.  Thus,  you  must 
have  remote  login  privileges  on  that  machine. 

—  A  common  value  for  the  default-job-server-class-variant  is  the  null  string. 

•  repair-host  host  default-job-server-class-variant 

—  Used  to  change  the  variant. 

•  delete-host  host 

4.3  The  workers  Plex 

In  the  previous  section,  we  described  the  workers  plex  as  conceptually  a  relation.  Here  we 
describe  its  actual  representation,  and  the  connection  to  the  conceptual  relation.  A  line  in 
the  workers  plex  has  the  format: 

e  worker  executable-file  machine-type  {tooli}i 

—  The  machine-type  indicates  on  which  architectures  the  worker  can  run.  While 
this  field  can  contain  the  empty  string,  for  now  it  oughtn’t.  ’ 

—  Let  a  line  be  w,  /,  m,  {Ti}».  Its  contribution  to  the  conceptual  relation  of  sec¬ 
tion  4.1  is  the  set  of  tuples  (ly, 7i,Cj),  for  all  i  and  j.^ 

—  The  executable-file  field  is  the  file  outside  the  repository  from  which  the  installed 
executable  for  the  worker  was  copied  or  moved.  In  a  system  newly  installed  from 
a  distribution,  this  field  will  refer  to  the  file  in  the  distribution.  Ultimately,  it 
will  be  possible  to  install  a  worker  via  an  artifact,  in  which  case  this  field  will 
be  have  the  artifact  and  a  machine  variant  pair  m-v,  where  the  executable  is  the 
executable'/im-v  derivative  of  the  artifact. 

In  a  view  on  the  workers  plex,  the  fields  are  displayed  vertically,  with  the  tools  under¬ 
neath  the  executable.  [PRESENT  REALITY:  just  one  line  per  worker  entry.] 

'Proposed:  If  this  field  is  null,  it  works  for  all  machine  types  (and  is  presumably  a  shell-script). 

^There  is  presently  only  one  job  server  class,  which  is  currently  implicit  in  each  line  of  the  workers  plex. 
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c-sorker-»un3  sunS  C  /usr/local/«-l-<listribution/8un3/c-workar-sun3 

d«riv«-c-80urce/c-text 
derive-c-module/c-and-h-f ile 
dariv«-c-module/object-8un3 
dariTa-c-fflodula/axacutable-8un3 

c-vorkar-Bipa  mipa  C  /uar/local/a-l-distribution/mipa/c-aorker-mips 

dariva-c-aourca/c-taxt 
dariva-c-modula/c-and-h-fila 
dariva-c-modula/obj act -mips 
dariva-c-modula/axacutable-mips 

The  workers  plex  is  manipulated  with  the  following  commands:^ 

a  nev-vorker  worker  machine-type  job-server-class-list  file- or- artifact  tool-list 

—  The  worker  must  not  already  exist,  and  all  the  tools  in  the  tool-list  must  already 
exist  (e.g.,  be  registered  in  the  derivers  plex). 

—  When  called  interactively  without  a  prefix  argument,  this  will  prompt  for  an 
artifact  argument  for  the  file-or-artifact;  otherwise,  it  will  prompt  for  a  file.  With 
one  C-u,  the  file  will  be  copied;  with  two,  it  will  be  moved. 

a  repair-worker  worker  machine-type  job-server-class-list  file-or-artifact  tool-list 

—  The  arguments  are  the  same  as  for  new-worker. 

—  This  command  first  ensures  that  no  job-server  is  actively  running  the  worker. 

—  Default  values  for  the  arguments  are  supplied  using  the  values  already  in  the  plex. 
You  must  confirm  that  you  really  do  want  a  new  executable  installed. 

a  delete-worker  worker 

—  This  command  first  ensures  that  no  job-server  is  actively  running  the  worker. 

As  an  aside,  we  note  that  a  tool  may  be  present  in  a  worker  executable,  but  be  prevented 
from  running  on  certain  hosts,  by  making  no  entry  in  the  workers  plex  for  a  given  job  server 
class.  So  to  prevent  cc  compilation  from  running  on  a  certain  host,  say,  one  could  give  it  a 
default  job  server  class  X  (in  the  hosts  plex),  and  set  up  the  workers  plex  as  follows: 

c-worker-sun3  8un3  X  /usr/local/e-l-distribution/sunS/c-worker-sunS 

derive-c-source/c-text 
derive-c-module/c-and-h-f ile 

Thus,  two  job  servers,  one  with  class  C  and  one  with  class  X,  can  share  executables,  but  only 
the  job  server  class  C  will  receive  jobs  where  the  tool  is  derive-c-module/object5isun3  or 
derive-c-Bodule/executable%sun3. 

^[Present  reality:  a  tool-list  has  only  one  tool,  but  there  are  extra  commands  repair-worker-new- 
tool,  repair-worker-delet«-tool,  and  repair-worker-executable.] 
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4.4  The  derivers  Plex 

Recall  from  section  4.1  that  the  derivers  plex  is  conceptually  an  infinite  set  of  tuples  of  the 
form  {T,t,K).  The  finesse  to  gain  finiteness  is  a  two-fold  reliance  on  regular  expressions. 
First,  there  is  the  fact  that  a  regular  expression  itself  can  represent  an  infinite  set  of  strings, 
namely  those  that  it  matches.  Second,  there  is  the  fact  that  using  the  \(. . .  \)  construct  of 
Gnu  EmacS  (see  the  manual),  a  substring  of  a  matched  string  can  be  extracted,  and  using 
the  \i  construct  (ibid.)  for  replacing  the  t***  match,  the  extracted  string  can  be  substituted 
into  another  regular  expression.  This  double  reliance  on  regular  expressions  disposes  of  two 
levels  of  infiniteness — the  infinite  relation  with  infinite  sets  inside.  An  example  will  follow 
the  general  description  of  a  line  in  the  derivers  plex: 

•  artifact-type  tool  kind-re  kind-re-replacement 

—  The  semantics  of  this  line  is  described  by  giving  the  tuples  that  it  contributes  to 
the  conceptual  relation  used  in  section  4.1. 

—  Let  T  be  the  tool,  t  be  the  artifact  type,  re  be  the  kind-re  and  re-r  be  the  kind-re- 
replacement.  For  every  string  s  matched  by  re,  there  is  one  tuple  contributed  to 
the  conceptual  relation.  To  obtain  it,  take  the  substrings  of  s  located  by  re,  and 
substitute  these  for  occurrences  of  \i  in  re-r,  obtaining  a  new  regular  expression 
re'.  The  tuple  is  {T,i,K),  where  K  is  the  set  of  all  strings  matching  re'. 

—  The  use  of  kind-re-replacement  is  almost  redundant.  It  is  currently  used  to  al¬ 
low  one  to  specify  grouping  in  re'  without  having  it  misinterpreted  in  re  as  a 
substitution  position. 

—  kind-re-replacement  should  always  be  written  so  that  the  final  re'  matches  a  sub¬ 
language  of  re.  That  is,  no  deriver  should  produce  a  derivative  that  cannot  be 
directly  scheduled. 

To  get  a  feel  for  how  this  works  in  practice,  we  look  at  two  real  examples.  First  consider 
derivers  that  obtain  object  files  from  c-module  artifacts.  (See  the  C  and  Unix  Programming 
Manual  if  you  are  not  familiar  with  this  artifact  type.)  It  is  necessary  to  derive  the  object 
files  for  different  machines  with  different  tools,  so  we  use  the  kind-re  to  distinguish  them.  In 
the  example  entry  below,  the  derivative  class  is  ob j  ect  or  error;  the  variant  is  covered  by 
the  regular  expression  \(-[A-Za-zO-9]+\|\) 

derive-object-sun3  c-module  \(object\  I  error\)y.sun3\(-  [A-Za-zO-9]  +\  I  \  ) 

\ (ob j  ect\ I error\)Xsun3\2 

For  the  strings  s  =  object)isun3u  or  errorXsunSu,  the  tuple  is: 

(derive-object-s\m3,  c-module,  {object7.sun3t;,  error*/,sun3r>}) 

Here  v  ranges  over  the  empty-string  and  strings  beginning  with  -  followed  by  alphanumeric. 
Thus,  there  are  an  infinite  number  of  tuples,  but  the  kind  set  in  each  is  finite,  namely,  it 
has  two  elements.  This  reflects  the  fact  that  producing  an  object  file  is  inseparable  from 
producing  the  error  file  associated  with  the  compilation. 
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For  the  second  example,  consider  shellscript  artifacts  (ibid).  Here,  there  are  two  new 
issues:  the  manuscript  derivative  is  obtained  by  a  type  setting  tool,  but  all  other  derivatives 
are  obtained,  if  they  can  be  obtained  at  all,  by  a  tool  that  runs  the  shellscript.  So  the  relevant 
regular  expressions  are: 

dariTa'shellscript-manuscript  shellscript  manuscript  manuscript 
dsrive-shellscript-others  shellscript  \ 'manuscript  \ 'manuscript 

The  X"  operator  means  the  complement  of  the  regular  expression  following  it.  The  second 
line  here  contributes  the  tuple: 

(derive-shellscript-others,  shellscript,  {all  strings  /  "m«inuscript"}) 

So  here  there  is  only  one  tuple,  but  its  kinds-set  is  infinite. 

The  following  commands  modify  the  derivers  plex. 

•  new-deriver  tool  artifact-type  kind-re  kind-re-replacement 

—  The  tool,  artifact-type  pair  may  not  already  exist. 

—  The  artifact-type  must  already  be  registered  in  the  types  plex. 

•  transpose-derivers  mark  point 

—  This  command  can  be  issued  only  from  a  view  on  the  derivers  plex,  with  a  mark 
set  on  line  and  the  cursor  on  another.  It  exchanges  the  two  lines. 

—  Rationale:  the  order  of  lines  in  the  derivers  plex  is  significant,  so  it  must  be 
possible  to  change  the  order  of  entries. 

•  repair-deriver  tool  artifact-type  kind-re  kind-re-replacement 

—  The  current  values  of  the  last  three  arguments  are  presented  ^ls  defaults,  making 
it  easy  to  change  only  one  of  them. 

e  delete-deriver  tool 


4.5  Writing  a  Deriver 

This  section  gives  a  language-independent  description  of  how  to  write  a  deriver.  VVe  will 
discuss  separately  how  to  map  this  abstract  specification  to  the  concrete  rules  for  different 
implementation  languages,  currently  C,  Emacs  Lisp,  and  Common  Lisp. 

A  deriver  is  implemented  ais  a  set  of  functions  {/i},,  many  of  which  can  be  null,  indicating 
some  default  behavior.  These  functions  are  called  by  the  general  derivation  mechanism, 
which  also  has  as  arguments  the  artifact  under  derivation,  which  we  shall  consistently  refer 
to  as  p,  and  the  set  of  kinds,  each  of  which  was  matched  by  the  kind-re  field  for  the  derivers’s 
line  in  the  derivers  plex.  A  deriver  is  given  the  chance  to  consider  relatives  of  p,  i.e.,  artifacts 
that  are  historiczdly  related  to  p.  For  now,  the  only  relatives  are  immediate  predecessors, 
but  this  may  be  generalized  in  the  future. 

The  functions  /i,  /j,  and  f^  take  as  their  first  three  arguments: 
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•  The  body  of  an  artifact  (either  p  or  one  of  its  relatives). 

•  The  artifact  cell.  The  deriver  should  not  modify  any  slots  in  the  cell. 

•  The  location  of  an  arbitrary  datum,  e.g.  the  location  of  the  contents  of  the  artifact;  /i, 
fly  or  /a  can  store  in  this  location  zmy  data  that  must  be  passed  between  /^’s.  location 
initially  contains  a  null,  and  /i,  /a,  or  /a  can  set  and  read  its  contents.  and  /s  can 
read  location's  contents,  because  the  value  in  location  becomes  the  value  of  the  contents 
slot  of  the  corresponding  dossier  (see  below). 

The  remaining  arguments  and  /«  and  /$  are  described  below. 

To  declare  a  deriver,  enter  a  call  on  the  following  function  in  the  initialization  of  an 
appropriate,  necessarily  language-specific,  worker: 

•  enter-deriver  name  fi, fk  job-server-incremental 

—  The  name  is  a  string  that  names  the  tool.  The  standard  convention  is  to  use 
type/ kind,  and  when  more  than  one  kind  can  be  produced,  to  choose  the  principal 
one  or  a  suggestive  term.  The  name  is  not  parsed  by  the  system;  as  explained  in 
section  4.1,  it  is  used  only  as  a  match  of  the  same  name  in  the  derivers  plex. 

—  The  fi  must  occur  in  the  order  listed  below. 

-  job-server-incremental  should  be  true  iff  none  of  the  derivatives  produced  by  the 
deriver  ever  embeds  the  artifact  pointer  of  the  argument  artifact  (or  the  point¬ 
ers  of  references  if  these  do  not  appear  in  the  derivative  strings  of  the  refer¬ 
ences).  Otherwise  job-server-incremental  should  be  false.  More  precisely,  job- 
server-incremental  is  true  when  the  job  server  can  apply  its  default  incremental- 
ity  machinery  based  on  body  strings  and  the  derivative  strings  of  requisite  and 
reference  artifacts. 

Now  the  specifics  of  the  /,.  Each  one  is  given  a  suggestive  name,  and  an  argument/result 
specification. 

o  fi  =  computable-kinds:  body  type  location  requested-kind- sequence  — +  kind/uid-request- 
sequence 

—  requested-kind-sequence  is  the  set  of  requested  kinds. 

—  The  result  is  a  sequence  of  kind/uid-request  objects  composed  of: 

♦  A  kind  for  p  that  this  deriver  will  produce  given  requested-kind-sequence.  This 
may  be  a  kind  not  in  requested-kind-sequence.  For  example,  a  deriver  that 
is  asked  to  produce  an  object  file  might  yield  both  an  object  and  an  error 
derivative. 

♦  uid-count,  an  integer  that  indicates  the  number  of  uids  that  likely  to  be 
needed  for  kind.  This  is  an  estimate  needed  for  efficiency  only.  Calls  to 
functions  that  return  uids  (see  4.6)  will  always  succeed,  but  requests  beyond 
the  number  given  by  uid-count  will  result  in  additional  (and  somewhat  costly) 
communication  with  the  job  server. 
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-  The  default  result  (i.e.,  the  one  used  if  a  function  is  not  supplied),  is 

{{k,0))keTtii%etied-kind  ieiuencti  i-^-  the  kinds  of  requested-kind-sequeiict  with  no  uids 
requested. 

0/2=  requisite-kinds:  body  type  location  —*  kind-sequence 

—  The  result  is  the  set  of  derivative  kinds  for  p  that  are  needed  for  this  deriver  (and 
presumably  computed  by  some  other  deriver). 

—  For  example,  in  deriving  an  object  file,  the  result  would  be  {c-f  ile}. 

—  The  default  result  is  the  empty  set. 

0/3  =  reference-handle-kinds:  body  type  location  integer  —*  handle-kind-sequence 

—  The  result  is  a  set  of  handle/kind  pairs  that  specify  which  kinds  of  derivative  are 
needed  from  which  references.  The  same  handle  may  appear  more  than  once, 
with  different  kinds. 

—  This  function  is  not  called  if  there  are  no  references. 

—  The  integer  is  the  number  of  references  of  the  artifact,  a  convenience  for  situations 
in  which  the  same  kind  of  derivative  is  needed  from  each. 

—  For  example,  to  compute  the  c-text  derivative  of  c-source  artifacts,  the  result 
is  {(0,  c-text), . . . ,  (n  —  1,  c-text)},  where  n  is  the  integer  argument. 

—  The  default  result  is  the  null  set. 

The  next  two  functions  use  a  data  structure  called  a  dossier,  summarizing  information 
about  derivatives  of  an  artifact  and  its  references.  The  fields  of  a  dossier  are: 

•  body — The  deriver  should  not  modify  this  field. 

•  contents — What  was  placed  in  location  by  /i,  /2,  or  fs. 

•  kind-derivative-sequence — A  list  of  kind/derivative-string  pairs,  which  we  denote 

{ ( requisite-ki ,  requ isite-d- ) } , . 

The  requisite-ki  are  from  the  result  of  requisite-kinds  on  the  artifact;  the  requisite-d, 
have  been  obtained  and  filled  in  by  the  job  server.  For  p,  all  of  the  requisiie-d^  will  be 
non-null,  because  they  will  have  been  computed  if  necessary.  For  relatives,  any  or  all 
of  the  requisite-di  may  be  null,  because  they  are  supplied  only  if  they  already  exist. 

The  deriver  should  not  modify  this  field. 

•  handle-kind- derivative-sequence — A  list  of  handle/kind/derivative-string  triples,  which 
we  will  denote  by  {{hj,kj,dj)}j.  The  hj,kj  are  from  the  result  of  reference-handle- 
kinds  on  the  artifact  and  are  in  the  same  order;  the  dj  have  been  supplied  by  the  job 
server.  As  in  the  previous  Ccise,  the  dj  for  p  will  all  be  present,  while  for  the  relative, 
only  the  previously  existing  derivative  strings  will  be  non-null. 

The  deriver  should  not  modify  this  field. 
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There  are  three  more  f;.  The  first  is  responsible  for  comparing  relatives  with  p,  the 
second,  puts  all  the  pieces  together,  and  the  last  is  used  when  a  deriver  is  likely  not  to 
terminate. 

0/4=  compare-relative:  dossier  dossier  kind-derivative-sequence  — *  boolean  t±)  hint 

—  The  first  argument  is  for  p;  the  second,  for  a  relative,  call  it  p'. 

—  Let  ({k,  computable-di^))^ic^)^ki»i/uU-ni*ctt-teiutnee  denote  the  value  of  the  kind- 
derivative-sequence  argument  (where  kind/uid-request-sequence  is  the  result  of 
fi).  Each  computable-di^  is  the  derivative  string  of  p'  for  kind  k  (supplied  by  the 
job  server).  These  derivative  strings  might  be  null  (they  are  not  computed,  only 
looked  up),  but  not  all  of  them  are  null,  or  else  this  function  would  not  be  called 
on  this  p'. 

—  A  result  of  true  means  that  output  derivative  strings  of  p'  are  valid  derivative 
strings  of  p.  A  result  of  false  means  not  only  that  the  derivative  strings  of  p'  are 
not  valid,  but  also  that  there  is  nothing  in  the  extant  derivative  strings  of  p'  that 
is  of  use  in  the  derivatives  of  p.  A  result  that  is  a  hint  has  some  useful  information 
for  derivation.  It  will  be  passed  along  to  the  next  function.  A  hint  is  an  arbitrary 
data  structure  of  the  deriver  writer’s  choosing.  It  must  incorporate  everything 
that  can  be  gleaned  from  p',  because  p'  itself  will  not  be  seen  again. 

—  This  function  is  called  only  when  either  the  bodies  are  different  (not  necessarily 
the  contents)  or  the  derivative  strings  of  references  are  different  (as  strings,  not 
necessarily  inequivalent  in  the  sense  of  section  3.1);  otherwise,  the  job  server 
knows  that  the  answer  should  be  true. 

—  The  default  result  is  false  (useful  when  a  relative’s  derivatives  are  valid  only  when 
the  relative’s  body  and  derivatives  are  identical,  and  are  otherwise  useless). 

o  fi=  derive:  dossier  kind-derivative-set-sequence  hint-set  — »  derived-results 

—  The  dossier  is  for  p. 

—  Denote  the  kind-derivative-set-sequence  argument  by  {{derive-km,  Dm)}m-  Each 
derive-km  is  a  A:  such  that  {k,x)  €  kind/uid-request-sequence  and  for  which  no 
derivative  string  was  available  from  relatives.  Each  Dm  is  a  (possibly  empty)  set 
of  derivative  strings  of  kind  derive-km  of  relatives  of  p.  If  possible,  an  element  of 
Dm  should  be  used  in  the  result.  You  may  think  of  this  as  a  universal  deriver- 
supplied  hint  (as  opposed  to  a  hint,  which  is  extender-provided). 

—  The  hint-set  is  constructed  from  calls  on  the  previous  function.  (In  languages 
that  represent  sets  as  ordered  sequences  or  lists,  the  elements  are  in  no  particular 
order.) 

—  derived-results  is  {derive-dm}m.  This  is  a  sequence  of  derivative  strings  parallel  to 
kind- derivative-set- sequence. 

—  Deriver  requests  for  uids  and  filenames,  streams  or  standard  text  handles  associ¬ 
ated  with  them  should  be  made  through  the  functions  described  in  4.6. 
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-  This  function  must  be  supplied;  it  has  no  default  alternative.  However,  it  is  not 
called  if  all  derivatives  can  be  computed  incrementally. 

o  /e  =  abort-job-hook:  abort-job-data  kind-sequence  — » 

—  The  deriver  writer  should  supply  this  function  exactly  when  the  deriver  cannot 
guarantee  termination,  but  needs  to  return  an  error  derivative  (e.g.  derivers  that 
run  LaTeX  or  COMMON  LiSP  compilers). 

-  This  function  executes  asynchronously  during  compare-relative  (Z^)  or  derive  (/s), 
when  a  user  executes  a  M-x  abort- job  command  on  the  derivation. 

-  abort- job- data  is  a  generic,  derivation-local,  pointer  maintained  by  the  worker 
kernel.  It  is  set  by  calling 

set-abort- job-data:  generic-pointer  —* 

which  copies  generic-pointer  to  abort-job-data.  (An  error  is  signaled  if  set-abort- 
job-data  is  called  when  no  /e  is  supplied,  or  in  any  function  other  than  compare- 
relative  or  derive.) 
abort- job- data  can  be  read  with 

get-abort- job-data:  — »  generic-pointer 

-  kind-sequence  is  {k){k,T)^kind/%u-Tti*t$i-$ttutncf,  where  kind/uid-request-sequence  is  «is 
described  in  /j. 

-  The  only  way  that  /e  can  affect  the  derivation  is  to  either 

1.  use  set-abort-job-data  and  return  normally,  or 

2.  use  any  language-specific  non-local  return. 

-  If  /e  is  not  supplied  or  if  set-abort -job-data  heis  not  been  called,  then  M-x 
abort- job  causes  the  derivation  to  return  aborted  derivatives: 

“BJob  aborted  by  user. 

4.5.1  Reporting  Errors 

There  are  three  independent  means  by  which  a  deriver  might  indicate  a  problem  with  deriva¬ 
tion: 

•  It  might  return  a  bad  derivative. 

•  It  might  return  a  non-empty  derivative  string  of  class  error. 

•  It  might  call  some  worker-kernel-supplied  error  function.  This  method  should  be  re¬ 
served  for  reporting  errors  that  cannot  be  caused  be  erroneous  user  input  (in  other 
words,  for  errors  that  “should  not  happen”). 

Many  errors  are  handled  by  the  job  server,  and  require  no  action  by  the  deriver.  The  job 
server  handles  errors  due  to 
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•  non-existence  of  a  suitable  deriver, 

•  catastrophic  worker  failure  (see  below  for  implementation-language-specific  details), 

•  a  bad  requisite  or  reference  derivative,  and 

•  requested  kinds  that  are  not  among  the  computable  kinds. 

Whenever  a  deriver  returns  an  error  derivative  or  a  bad  derivative,  the  job  server  posts  a 
notice  to  the  job’s  requesters.  The  type  of  the  notice  is  derive  error.  Displaying  a  notice 
with  this  type  will  invoke  examine-problems  on  the  artifact  and  relevant  derivatives. 

The  deriver  must  report  all  other  errors,  e.g.  parse  errors  in  a  program-text  artifact.  The 
sequel  concerns  details  and  strategies  for  reporting  problems  or  errors  to  the  E-L  users  using 
one  of  these  methods. 

Bad  Derivatives  As  defined  in  Section  3,  a  bad  derivative  is  one  whose  derivative  string 
begins  with  C-b.  A  completely  bad  derivative  string  has  only  one  C-b,  namely  the  first.  The 
rest  of  the  string  is  completely  uninterpreted  by  the  system  but  appears  in  automatically 
issued  notices. 

A  partially  bad  derivative  is  a  bad  derivative  of  the  form 

"  ^message"  ^usuaU  derivative 

The  system  does  not  interpret  message,  but  uses  it  for  displaying  notices  to  the  user  (see 
below).  However,  usual-derivative  must  conform  to  the  format  for  this  derivative  class  as 
registered  in  the  derivative-classes  plex.  The  system  recognizes  the  contents  of  usual- 
derivative  for  the  purposes  of  garbage  collection  just  as  it  does  an  ordinary  good  derivative. 

Error  Derivatives  A  deriver  may  return  a  non-empty  derivative  string  of  class  error 
provided  that  its  ability  to  do  so  has  been  registered  in  the  derivative-classes  plex. 

This  is  the  most  flexible  method  of  reporting  an  error,  since  any  information  can  be 
stored  in  the  contents  of  the  error  derivative.  Many  kinds  of  user  support  can  be  provided 
in  the  function  that  displays  the  error  derivative  for  a  particular  artifact  type.  Consult  the 
derivative-classes  plex  for  the  allowable  format  of  error  derivatives. 

Dispatch  for  display  of  non-empty  error  derivatives  is^  by  artifact  type  alone. 

There  is  a  subtle  point  to  be  observed  about  error  class  derivatives.  A  good  error 
derivative  must,  like  all  other  derivatives,  functionally  depend  on  the  contents  of  the  artifact 
from  which  it  is  derived,  or  on  the  derivatives  of  the  artifact’s  references.  Thus,  if  the 
error  being  reported  is  due  to  a  system  resource  limitation,  or  is  otherwise  not  dependent 
on  the  contents  of  any  artifact,  the  error  derivative  should  be  partially  bad.  The  user  is 
then  properly  alerted  that  deleting  the  derivative  may  destroy  unrecoverable  information. 
Furthermore,  a  partially  bad  error  derivative  string  will  not  be  re-used  as  a  derivative  for 
successor  artifacts,  nor  will  it  be  used  by  related  derivatives. 

^In  the  near  term,  at  least. 
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It  is  up  to  the  deriver  writer  to  decide  what  combination  of  bad,  partially  bad  and  error 
derivatives  to  return  when  a  derivation  problem  arises.  Returning  a  partially  or  completely 
bad  derivative  is  suitable  only  when  it  is  acceptable  to  abort  jobs  for  which  the  job  reporting 
the  error  is  a  prerequisite. 

4.6  Language-Independent  Utilities 

The  following  utilities  are  provided  in  both  C  and  LlSP  worker  kernels. 

•  uniform-handle-kinds  first  last  kind  — »  handle-kind-sequence 

—  Return  a  sequence  of  handle-kind  pairs  of  the  form  {{i,  . . . last- 

—  Signal  an  error  if  first  <  0. 

—  Return  an  initialized,  empty  sequence  if  last  <  first. 

—  In  practice,  often  called  by  a  rtference-handle-kinds  function. 

•  derivative-path  kind  filename  — *  path 

*  Return  a  fully-qualified  path  name  for  a  file  holding  derived  information  for 
kind  and  having  the  given  filename. 

•  derivative-area  kind  —*  path 

*  Return  the  full  path  name  for  the  directory  holding  derived  information  files 
for  the  given  kind. 

•  stream-and-uid  flag  area  extn  —*  stream  uid 

—  Return  a  stream  open  for  writing  in  a  directory  determined  by  flag  and  area. 

—  If  flag  is  0  then  area  is  interpreted  as  a  derivative  kind  and  the  file  associated 
with  stream  is  in  the  directory  for  that  kind.  If  flag  is  1,  area  is  interpreted  as  an 
artifact  type  name,  and  the  associated  file  is  in  the  directory  for  that  type. 

—  The  file  associated  with  the  returned  values  is  guaranteed  to  be  previously  unused 
in  the  associated  directory. 

—  uid  is  the  file  name  without  extension  or  leading  path. 

—  If  extn  is  null  then  the  name  of  the  associated  file  is  uid,  a  string  of  digits. 

—  For  all  other  values  of  extn  the  associated  filename  is  uid  followed  by  a  and 
then  the  value  of  extn. 

•  f  ilename-and-uid^o^  area  extn  — »  filename  uid 

—  Like  stream-and-uid  except  that  the  first  value  returned  is  the  name  of  the 
associated  file. 

—  filename  is  not  a  fully-qualified  path  name,  but  (can  be  used  as)  the  name  of  a 
file  in  the  associated  directory. 
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—  Use  this  function  only  if  you  want  the  filename  for  purposes  other  than  simply 
opening  a  stream  on  it.  It  is  more  efficient  to  use  stream-and-uid  than  to  open 
a  stream  on  a  file  name  generated  by  f  ilename-and-uid. 

The  functions  stream. and.uid  and  sth_and_uid  are  more  appropriate  for  the 
latter  purposes,  since  they  more  efficiently  check  for  the  file’s  prior  existence  when 
opening  it. 

4.7  How  to  Write  a  Deriver  in  C 

To  write  a  deriver  in  C,  you  need  to  know  (in  addition  to  the  material  in  the  previous  sec¬ 
tions),  the  precise  types  and  calling  conventions  for  the  functions  that  constitute  the  deriver, 
how  to  manipulate  the  values  with  various  relevant  utilities,  how  to  build  an  executable  that 
can  be  installed  as  a  worker,  and  how  to  debug  your  new  deriver. 

4.7.1  How  to  Create  or  Extend  a  C  Worker  Executable 

As  usual,  we  begin  at  the  end,  with  a  description  of  what  to  do  once  you  have  actually 
written  your  deriver.  You  then  create  a  C  worker,  as  a  Unix-program  artifact.  For  example: 

Example  C  worker 

-  Targets  — -  "  — - - 

sun4-dbg  [cc  (universal  artifact)]  -g 
C  (cc  (universal  artifact)]  -g  -I/e-l/lib/include 

- Modules /Instructions - 

[Example  C  worker  initialization  module] 

[Example  C  worker  implementation] 

-L/e-l/lib/sun4  -Iworker  -lantique  -Imodern 

A  line-by-line  explanation: 

•  The  first  line  is  the  caption. 

•  The  entry  in  the  Targets  section  (which  could,  in  general,  be  one  of  several)  contains 
compilation  switches  that  cause  symbols  to  be  saved  for  debugging  (-g)  and  that  guide 
the  compiler  to  some  header  files  outside  the  artifacts  system  (-I/e-l/lib/ include).^ 

•  The  Modules/Instructions  section  refers  to  one  module  (in  this  C2ise,  the  first)  that 
contadns  the  worker  initialization  function,  whose  name  must  be  extenders.init. 
This  parameter  less,  resultless  function  must  include  the  initialization  code  for  each 
tool  in  the  worker.  Normally,  this  is  done  by  references  to  C-source  artifacts,  so  that 
the  actual  code  can  also  be  referred  to  in  documentation.  This  initialization  module  is 
written  by  the  extender,  but  may  be  patterned  after  extant  examples. 

®  In  the  future,  all  dependences  on  files  outside  the  artifacts  system  will  be  represented  via  artifact 
references.  E-L  now  has  all  the  machanisms  to  make  this  possible,  but  not  all  of  its  source  code  has  been 
converted  to  use  them. 
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•  The  remaining  modules  (here  only  one)  ha'  e  the  code  for  the  various  derivers  in  the 

worker. 

•  The  final  line  consists  of  switches  that  cause  the  worker  to  be  linked  with  objects  that 
outside  the  artifacts  system.® 

The  initialization  module  for  the  example  worker  registers  its  only  deriver,  which  needs  to 
supply  only  one  function. 

Example  C  worker  initialization  module 

-  Targets - 

- Source - 

[stdio.h]<8pec> 

[deriver.h]<8pec> 

[drvr-utils.h]<spec> 

[Example  C  worker  implementation]  <spec> 

gio^  void 
^  eztender8_init() 

{ 

«nt«r.deriver("my-type/exainple", 

NULL,  /*  computable-kinds  */ 

NULL,  /*  requisite-kinds  */ 

NULL,  /*  reference-handle-kinds  */ 

NULL,  /*  compare-relative  */ 

example.deriver , 

NULL,  /*  abort -job-hook  ♦/ 

1);  /*  job-server-incremental  (==  TRUE)  */ 

} 

The  example  tool  does  nothing  but  produce  a  constant  derivative. 

Example  C  worker  implementation 

-  Targets - 

- Source - 

[deriver.h]<spec> 

[drvr-utils.h]<spec> 

[derive  function  for  example  kind] 

In  a  more  complex  deriver,  there  would  be  a  reference  to  each  of  the  deriver  functions  (e.g., 
the  compare-relative  function),  and  there  would  probably  be  supporting  functions  as  well. 
The  prerequisite  artifacts  [deriver. h]  and  [drvr-utils.h]  supply  necessary  type  and  function 
declarations.®  (See  the  next  section  for  more  on  this  subject.) 

®At  present,  these  are  universal  artifacts  connecting  to  header  files  outside  the  artifacts  system  When 
the  conversion  of  E-L’s  source  code  is  complete  (see  footnote  5),  these  will  be  replaced  by  references  to 
ordinary  module  artifacts. 
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Finally,  the  one  necessary  function: 

derive  function  for  example  kind 
glo^  string.sequence.type  * 

I  example.deriver (dossier.  relative.k.D.pairs,  hint. list) 

I  dossier.type  edossier; 

I  k.D.sequence.type  *relative_k_D.pairs; 

^  generic.pointer.sequence.type  *hint_list; 

return  singleton.derivative.sequenceC'some  derivative"}; 

} 

See  Section  4.6  for  singleton.derivative.sequence. 

4.7.2  Types  and  Calling  Conventions 

We  first  describe  the  correspondence  between  C  types  and  those  used  in  the  language- 
independent  description  of  derivers.  For  C  types  with  “sequence_type”  in  their  names 
and  for  “string_sequence2_type”,  see  the  chapter  on  sequences  in  Artifacts  System  C 
Utilities.  The  list  below  gives  the  type  of  an  element  of  the  sequence. 

•  body — char  * 

•  boolean  W  hint — boolean.or.generic.pointer  ♦;  cast  your  hint,  or  the  values  TRUE 
or  FALSE,  to  this  type. 

•  dossier — dossier.type  *,  where  the  struct  has  the  following  types/fields: 

—  char  ♦body 

—  generic.pointer  contents 

—  k.d.sequence.type  *requisite_k_d 

—  h.k.d. sequence. type  ♦reference.h.k.d 

e  handle-kind-derivative-sequence — h.k.d.sequence.type  *;  elements  have  type 
h.k.d.type,  which  has  fields: 

—  int  hzmdle 

—  char  ♦kind 

—  char  ♦derivative 

•  handle-kind-sequence — h.k.sequence.type  ♦;  elements  have  type  h.k.type,  which 
has  fields: 

—  int  handle 

—  char  ♦kind 
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•  hint-set — generic.pointer.sequence.type  •;  elements  have  type 
generic.pointer. 

•  integer — int 

•  kind-derivative-sequence — k_d_sequence_type  ♦;  elements  have  type  k_d_type, 
which  has  fields: 

—  char  *kind 

—  char  *derivative 

•  kind-derivative-set-sequence — k_D_sequence_type  ♦;  elements  have  type  k_D_type, 
which  has  fields: 

—  char  *kind 

—  string_sequence_type  ♦derivative_set 

•  kind-sequence — string_sequence_type  ♦;  elements  have  type  char  *. 

•  kind/uid-request-sequence  — 

typedef  struct  k_ur_sequence_type  { 
k_ur_type  ♦elements; 
int  length; 

}  k.ur.sequence.type; 

where  elements  have  the  type 

typedef  struct  k.ur.type  { 
char  ♦kind; 
int  uid.count 
}  k_ur_type; 

•  location — generic.pointer  ♦  (See  Artifacts  System  C  Utilities.) 

•  uids — string_sequence2_type  ♦ 

The  calling  conventions  for  the  C  functions  are  exactly  analogous  to  those  in  the  language- 
independent  specification,  using  the  corresponding  C  types.  The  only  slight  subtlety  involves 
location  arguments,  whose  type  is  generic.pointer  ♦.  With  both  location  and  hint,  the 
extender  will  likely  cast  the  generic  pointer  to  some  more  specific  data  type. 
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4.7.3  Relevant  Utilities 

The  document  entitled  Artifacts  System  C  Utilities  contains  many  useful  types  and  functions 
that  are  available  to  the  author  of  tools  in  C.  Chapter  2  of  that  document,  “Tool  Writer 
Facilities”,  has  interface  descriptions  for  the  various  utility  functions  (and  other  program 
objects)  mentioned  but  not  defined  throughout  this  section.  Read  Chapter  1  for  a  instruc¬ 
tions  on  incorporating  the  utilities  into  your  derivers.  (There  should  be  no  need  to  consult 
Chapter  3,  “System  Extender  Facilities”.)  If  you  don’t  find  what  you  need  defined  either  in 
this  document  or  Chapter  2  of  Artifacts  System  C  Utilities,  either  appeal  to  the  editor  of 
the  latter  or  write  it  yourself. 

Of  particular  interest  are  functions  that  allocate  “job-persistent”  storage,  which  is  heap 
storage  that  is  registered  to  be  garbage-collected  when  the  C  worker  returns  to  its  lowest- 
level  listening  loop,  i.e.,  when  all  jobs  it  has  been  engaged  in  are  done.  Since  tools  rarely 
need  heap  data  that  lives  longer  than  the  current  job,  the  tool  writer  is  freed  from  worries 
about  reclaiming  storage. 

Chapter  2  also  contains  import<int  information  about  exception  handling,  accessing  deriv¬ 
ative  files,  reading  and  manipulating  artifact  contents  and  using  regular  expression  searches. 
There  are  also  definitions  of  a  variety  of  useful  data  structures,  including  sequences,  hash 
tables,  and  standard  text  handles  (variable-length  arrays  of  characters). 

The  rest  of  this  section  describes  utilities  of  only  to  C  deriver  writers.  Some  are  mentioned 
in  examples  and  algorithms  elsewhere  in  this  document.  These  functions  must  be  used  in 
the  context  of  a  <spec>  reference  to  the  universal  artifact  called  drvr-utils  .h.  (See  the 
example  deriver  above.) 

•  h_k_sequence_type  ♦uniform.handle.kinds (first,  last,  kind) 
int  first,  last; 

char  *kind; 

•  char  ♦derivative_path(kind,  fileneune) 
char  ♦kind,  ♦uid; 

•  char  ♦derivative_area(kind) 
char  ♦kind; 

—  See  the  corresponding  l^lnguage-independent  functions  in  Section  4.6. 

•  FILE  ♦stream_and_uid(flag,  area,  extn,  uid) 
int  flag;  char  ♦area,  ♦extn,  ♦♦uid 

—  See  stream-and-uid. 

—  The  result  is  stream. 

—  uid  must  be  the  non-null  address  of  a  char  ♦,  in  which  sth_and_uid  places  the 
uid  result  of  stream-and-uid. 

•  char  ♦f ilename_and_uid(flag,  area,  extn,  uid) 
int  flag;  char  ♦area,  ♦extn,  ♦♦uid 
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-  f  ilenzune.and.uid  is  to  f  ilename-and-uid  (section  4.6)  as  stream_and_uid  is 
to  stream-and-uid. 

—  Use  this  function  should  only  if  you  want  the  filename  for  purposes  other  than 
simply  opening  a  stream  on  it.  The  functions  stream_and_uid  and  sth_and_uid 
(below)  are  more  appropriate  for  the  latter  purposes,  since  they  more  efficiently 
check  for  the  file’s  prior  existence  when  opening  it. 

•  8th_t]rpe  8th_and_uid(flag,  area,  extn,  uid) 
int  flag;  char  ■^area,  *extn,  **uid 

—  Returns,  in  gc’ed  storage,  an  sth.type  in  a  directory  determined  by  flag  and 
area.  The  other  arguments  are  as  for  stream-and-uid. 

•  char  *new_derivative_f  ilenaneCkind,  extn) 
char  ♦kind,  ♦extn; 

—  OBSOLETE.  Use  f  ilename.and.uid. 

—  For  backward  compatibility,  if  extn  is  DEFAULT.EXTN  then  the  filename  will  have 
any  file  extensions  appropriate  for  the  kind.  This  usage  is  deprecated. 

-  Returns  the  uid  corresponding  to  filename  in  newly  allocated,  gc’d  storage  The 
uid  is  a  string  representing  a  decimal  integer. 

-  Returns  NULL  if  it  can’t  find  a  uid  in  filename. 

—  Simply  strips  off  UNIX  filename  extensions  and  checks  that  what’s  left  looks  like  a 
uid.  Although  generality  might  suggest  otherwise,  for  sanity  checking  it  assumes 
that  the  extension  begins  with  and  includes  the  first  period  (‘.’)  in  filename. 

e  int  chvdColdpath,  nevpath) 
char  ♦oldpath,  ♦newpath; 

—  If  nevpath  is  not  NULL,  changes  the  current  UNIX  working  directory  to  newpath 
using  chdir. 

—  Puts  the  previous  working  directory  (obtained  with  getwd)  into  oldpath,  unless 
nevpath  is  NULL,  when  it  does  nothing. 

—  The  caller  is  responsible  for  allocating  enough  storage  (at  least  MAXPATHLEN  char¬ 
acters)  for  oldpath. 

—  Returns  0  on  successful  completion,  or  -1  if  newpath  is  null. 

—  Signals  an  error  if  chdir  encounters  a  problem. 

e  string_sequence_type  ♦singlet on.derivative.sequence (derivative) 
char  ♦derivative; 

—  Returns  a  sequence  with  exactly  one  string  element,  a  newly  allocated,  automat¬ 
ically  garbage-collected  copy  of  derivative. 
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•  boolean  same.f ile_contents(pathl,  path2,  directory) 
char  *pathl.  ♦path2,  *directory; 

—  Returns  TRUE  if  the  contents  of  the  files  given  by  pathl  and  path2  are  identical. 

—  The  path  names  are  taken  relative  to  directory  if  it  is  not  NULL.  If  directory  is 
NULL,  the  path  names  are  taken  relative  to  the  current  working  directory  (unless 
they  are  fully  qualified). 

—  Using  directory  or  path  names  relative  to  the  current  working  directory  (rather 
than  fully  qualified  path  names)  considerably  speeds  up  the  comparison. 

—  For  example,  to  compare  two  derivatives  in  uidl  and  uid2  of  kind  manuscript, 
use 


same.f ile.contents (uidl,  uid2,  derivative_area("m2Uiuscript") ) 

—  Signals  an  error  if  either  file  does  not  exist. 

•  int.sequence.type  *related_references(old_dossier ,  old.haindles, 

new.dossier,  nev.hemdles) 

dossier.type  *old_dossier,  ♦new.dossier; 
int.sequence.type  ♦old.handles,  ♦new.hamdles; 

—  Compares  subsets  of  the  references  of  two  related  artifacts,  returning  a  map  from 
the  reference  subset  of  the  younger  artifact.  The  map  indicates  which  member 
of  the  older  artifact’s  reference  set  each  younger-artifact  reference  is  related  to,  if 
any. 

—  The  inputs  are  dossiers  for  the  two  artifacts  and  sequences  containing  handles 
relative  to  each.  The  output  is  a  sequence  of  integers  parallel  to  the  nev.handles 
sequence;  each  entry  is  either  a  handle  from  the  sequence  old.handles  or  -1, 
which  means  there’s  no  relationship. 

—  An  artifact  is  related  to  another  if  0  or  more  artifact  predecessor  links  lead  from 
the  first  to  the  second.  Thus,  among  other  instances  an  artifact  is  related  to  itself, 
and  also  to  its  immediate  predecessor  (if  there  is  one).  Notice  that  the  definition 
is  not  symmetrical  —  for  the  purpose  of  this  function,  relatedness  goes  backward 
in  time. 

•  char  ♦compare.files(new_file,  file.set) 
char  ♦!  j.file; 

string_sequence_type  *file_set; 

—  Compares  the  file  named  by  new.f ile  to  each  of  the  files  in  file.set. 

—  The  files  named  may  be  a  mix  of  paths  relative  to  the  current  working  directory  or 
absolute,  fully-qualified  path  names.  However,  since  the  mention  of  any  absolute 
path  names  seriously  affects  efficiency,  it  is  best  to  confine  all  the  files  to  paths 
relative  to  the  current  working  directory. 
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—  If  some  element  of  file.set  has  byte-by-byte  the  same  contents  as  new_f  ile, 
deletes  new.file  and  yields  that  element.  Otherwise,  yields  new_f  ile. 

—  The  string  returned  is  shared  either  with  new_f  ile  or  with  one  of  the  elements 
of  file.set. 

—  In  typical  use  the  nev.f  ile  was  just  created  in  the  process  of  derivation.  The 
file.set  names  old  files  which  are  derivatives  of  relatives. 

—  Uses  same.f  ile.contents  to  do  the  comparisons. 

e  string_sequence_type  *  derivative.set.nev.gc 

—  Return  a  new,  empty,  initialized  set  of  derivatives  in  gc’d  storage. 

•  string_sequence.type  * 

select_derivatives(k.D.seq,  kindi,  di ,  ....  kindn ,  dn,  (char*)NULL) 
kind.derivative.s9t.sequence.type  ♦k.D.seq; 
char  *kindi,  *d\,  ....  *kindn,  *<4; 

—  Return  a  subsequence  of  the  d,  that  is  parallel  to  the  kinds  in  kind.derivative. 
set. sequence. 

—  kind.derivative.set.sequence  has  type  kind.derivative.set. sequence  * 

—  Each  ki  and  di  is  a  char*;  each  ki  is  a  derivative  kind  and  each  di  is  the  corre¬ 
sponding  derivative. 

#  k_ur.sequence.typek.ur_seqh.make_gc(n,  kindi,  count\ ,  ...,  kindn,  countn) 
int  n,  counti ,  countn’, 

char  *kindx ,  ....  *kindn ; 

—  Construct  a  kind/uid- request  sequence  of  length  n  using  each  kindi  as  a  kind  and 
each  count,-  as  the  corresponding  uid-request  count. 

*  area.type  One  of  TYPE  or  KIND.  Values  are  used  by  the  following  function  to  distinguish 
between  artifact  contents  files  (TYPE)  and  derivative  files  (KIND). 

•  char*  resource.path (nature,  subarea,  filename) 
area.type  nature;  char*  subarea,  *filename; 

—  Construct  the  repository-relative  path  to  subarea/filename,  given  the  nature 
of  subarea,  which  is  either  a  TYPE  string  or  a  KIND  string.  A  repository-relative 
path  is  one  whose  implicit  root  directory  is  that  given  in  the  environment  variable 
E.L.REPOSITORY. 
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4.7.4  How  To  Install  A  Worker  Developed  in  E-L 

Here  is  a  recipe  for  creating  a  new  worker  from  a  uniz-program  artifact  developed  as  de¬ 
scribed  in  section  4.7.1. 

1.  Determine  the  kind  of  derivative  to  export  and  install  by  selecting  one  of  the  taigel 
lines  of  the  program  artifact.  If  its  machine- variant  combination  is  sun4-dbg,  as  in 
the  example  of  section  4.7.1,  then  the  derivative  kind  is  executable%sun4-dbg. 

2.  Choose  a  directory  in  the  file  system  to  hold  the  executable,  and  a  file  name  for  it. 
Together  these  form  a  fully  qualified  path  name  at  which  the  worker  will  be  stored. 

3.  If  the  tools  in  your  worker  develop  new  classes  of  derivatives,  make  appropriate  entries 
in  the  derivative-classes  plex,  as  described  in  section  3.2. 

4.  Enter  the  tools  of  the  worker  in  the  derivers  plex  (section  4.4). 

5.  Select  a  logical  name  for  your  worker.  It  need  not  be  the  same  as  the  file  name  under 
which  it  is  stored.  Make  an  entry  in  the  workers  plex,  using  the  logical  name  and  the 
fully  qualified  path  name  from  step  2.  See  section  4.3  for  more  details  concerning  the 
workers  plex. 

6.  From  El-L,  use  the  M-x  export -derivative  command  to  export  an  executable  of  the 
kind  selected  in  step  1.  The  command  will  prompt  you  first  for  the  program  artifact, 
and  then  for  the  file  name  where  you  want  it  to  reside.  Give  it  the  fully  qualified  path 
name  from  step  27 

7.  Use  the  UNIX  command  chmod  to  make  the  newly  created  file  executable. 

Your  worker  should  now  be  installed  and  ready  to  handle  jobs. 

4.7.5  Debugging  A  Deriver 

This  section  describes  some  strategies,  tools,  and  pitfalls  associated  with  debugging  a  deriver 
offered  by  a  C  worker.  Ultimately,  the  methods  described  here  will  be  superseded  by  an 
approach  that  is  more  integrated  with  E-L. 

Here  in  outline  is  the  basic  procedure  for  debugging  a  worker.  Details  follow. 

1.  Follow  the  steps  in  section  4.7.4  to  create  the  worker  executable  and  export  it  to  the 
UNIX  file  system. 

2.  Set  up  the  derivers  and  workers  plexes  to  specify  the  worker  you  wish  to  debug. 

3.  Using  M-x  dbx  within  E-L,  or  using  your  favorite  debugger  outside  the  system,  start 
a  debugging  session  on  your  worker. 

^Notice  that,  despite  its  name,  the  export -derivative  command  is  not  fully  general.  It  applies  only  to 
certain  derivatives  that  have  files  associated  with  them,  e  g.  the  executable  derivatives  of  unix-program 
artifacts.  For  other  artifact  types  and  derivative  kinds,  beware. 
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4.  Start  a  job  server  as  an  inferior  process  to  the  worker.  This  job  server  will  schedule 
jobs  for  the  test  worker. 

5.  Commit  artifacts  or  use  M-x  derive-from  to  schedule  derivations  on  the  test  worker 
from  your  El-L  session.  You  may  watch  progress  in  the  present  plex. 

6.  When  you  are  done  debugging,  shut  down  the  job  server  from  the  worker,  and  exit  the 
worker  and  debugger. 

If  your  worker  handles  derivatives  also  handled  by  existing  workers,  and  there  are  others 
using  your  repository,  your  jobs  may  interfere  with  those  scheduled  by  others.  For  example, 
a  derivation  you  scheduled  to  test  your  worker  may  go  to  another  worker,  or  your  worker 
may  get  jobs  scheduled  by  others.  Shutting  down  competing  job  servers  may  help,  but  it 
may  also  make  other  users  unhappy.  It’s  best  to  test  only  when  the  system  is  otherwise 
quiet.® 

Notation.  We  use  *h  as  the  UNIX  shell  prompt  and  (dbx)  for  the  debugger  prompt. 

Invoking  the  debugger.  A  worker  developed  in  C  within  E-L  can  be  debugged  within 
E-L;®  just  apply  the  M-x  dbx  command  to  the  unix-progrzun  artifact  for  the  worker. 

On  the  other  hand,  you  can  also  debug  a  worker  outside  of  E-L.  To  provide  the  right 
environment  for  executing  a  worker,  however,  you  must  use  the  e-1  script  to  start  the 
debugger.  If  your  worker  is  my-worker,  execute  the  following  shell  command: 

X  e-1  -do  dbx  my-worker 

(The  debugger  we  use  in  the  examples  is  dbx,  but  similar  remarks  apply  to,  say,  gdb.) 

Next,  you  should  set  breakpoints  at  strategic  places  in  your  code,  for  example  at  each 
top-level  derivation  function.  You  may  also  set  breakpoints  at  the  following  functions  to 
catch  faults  or  exceptions  raised  by  the  error  and  warning  functions: 

(dbx)  stop  in  server.fault 
(dbx)  stop  in  server_error 
(dbx)  stop  in  server.warning 

Once  you  have  set  your  debugging  environment,  execute 

(dbx)  run  -i 

to  start  the  worker  with  its  standard  input,  output  and  error  output  set  to  your  terminal. 
(The  -i  switch  tells  the'  worker  nucleus  to  parse  RPCs  on  a  one-per-line  basis,  so  that  you 
can  drive  the  worker  interactively.  In  normal  operation,  RPCs  are  packaged  for  efficient 
transmission  between  processes.) 

*There  will  soon  be  an  easy  way  to  set  up  additional  repositories  to  allow  debugging  without  interference. 
®So  it  is  unfortunate  that  you  need  to  export  the  program  before  debugging  begins  simply  in  order  to 
register  it  in  the  workers  plex.  This  will  be  changed  soon.  For  now,  you  can  export  once  before  testing  your 
worker,  then  debug  within  E)-L,  and  finally  export  one  more  time  to  put  your  final  worker  into  production. 
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Starting  an  inferior  job  server.  Once  your  worker  is  started,  you  need  to  start  a  job 
server  that  will  send  derivation  jobs  to  your  worker.  This  is  done  by  starting  a  job  server 
inferior  to  the  worker.  If  you  have  started  the  worker  as  in  the  previous  paragraph,  it  will 
be  waiting  for  input  from  your  terminal.  The  worker  interprets  each  line  you  enter  at  your 
terminal  as  an  RPC  (remote  procedure  call).  The  first  RPC  you  send  starts  the  inferior  job 
server.  Enter  the  line 

start-server  (.worker-logical-name  host) 

followed  by  a  carriage  return.  The  worker-logical-name  is  the  first  entry  in  the  current 
worker’s  workers-plex  line  (its  logical  name,  not  its  executable  file  name),  enclosed  in  double 
quotes,  jhostl  is  the  host  name  (also  enclosed  in  quotes)  of  the  machine  on  which  you  are 
executing  the  worker.  So  if  the  host  is  named  sam  and  your  worker  has  logical  name  e-1- 
latex-worker,  you  would  enter  the  line 

start-server  ("e-l-latex-worker"  "saun") 

After  you’ve  entered  the  RPC,  the  worker  will  type  out  the  process  id  of  the  job  server  and 
wait  for  further  RPCs.  Remember  this  “pid”  for  later.  You  should  soon  see  a  new  entry  for 
job  server  in  the  present  plex  of  your  E-L  session. 

To  stop  a  server  without  terminating  the  worker,  enter  the  following  line  to  the  waiting 
worker: 

stop-server 

The  stop-server  RPC  has  no  arguments — it  already  knows  about  the  inferior  job  server. 
If  you  are  waiting  at  a  debugger  prompt,  either  because  of  an  error  or  breakpoint,  you  can 
shut  down  the  worker  by  having  the  debugger  execute  the  following  call: 

stop_server_receiver (0 ,0) 

When  all  else  fails,  call  the  parameterless  C  function  kill.serverO  from  the  debugger 
prompt,  or  try  a  kill  -2  on  the  job  server’s  pid  from  the  shell;  these  shut  down  an  inferior 
server,  and  cause  any  jobs  it  is  working  on  to  be  rescheduled. 

To  stop  a  worker  in  its  idle  loop  (i.e.,  while  it  is  waiting  for  an  RPC  from  your  terminal) 
issue  the 

stop-worker 

RPC.  This  gracefully  shuts  down  any  inferior  server  and  then  exits  the  worker  program 
back  to  the  debugger. 
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4.8  How  to  Write  a  Deriver  in  Common  Lisp 

This  section  assumes  familiarity  with  the  clisp-nodule  artifacts  as  detailed  in  the  Language 
Users  ’  Manual.  To  add  a  deriver  to  a  new  or  existing  Lisp  worker,  perform  the  following 
steps: 

1.  Write  the  six  deriver  functions  described  in  section  4.5. 

•  Section  4.8.2  describes  the  Common  Lisp  versions  of  data  types  mentioned  in  4.5 
and  calling  conventions  for  the  deriver  functions. 

•  Section  4.8.3  describes  various  utilities  useful  for  writing  derivers. 

2.  Extend  an  existing  Lisp  worker  with  your  new  deriver,  or  create  a  new  one.  See 
section  4.8.1  for  details. 

3.  Debug  your  deriver  using  methods  described  in  section  4.8.4. 

4.8.1  How  to  Create  or  Extend  a  Lisp  Worker  Executable 

A  Lisp  worker  is  a  Unix  executable  obtained  from  a  cl isp-program  artifact  of  the  appropri¬ 
ate  form.  Here  is  an  example  (in  fact  the  only  Common  Lisp  worker  currently  supported — 
they  are  quite  large.) 

-  Targets  - 

8un31ucid4  [Lucid  4.0.1]  e-1/ : : standalone-lisp'worker 

- Modules/Instructions - 

[Root  module  of  Common  LiSP  worker] 

The  single  clisp-module  referenced  by  the  program  above  is  called  Iw-root.  The  extender 
can  either  edit  this  module,  and  thereby  alter  the  existing  Lisp  worker,  or  copy  it  to  create 
another  one  by  embedding  it  in  a  different  clisp-program.  This  root  module  deals  with 
some  bootstrapping  issues  that  need  not  concern  the  extender.  The  two  points  to  note  here 
are:  where  to  insert  a  reference  to  the  module  containing  a  new  deriver,  and  what  to  do 
about  the  Lisp  package  used  to  encapsulate  that  deriver.  Here’s  the  root  module: 

Root  module  of  Common  LiSP  worker 

-  Targets - — — — — — 

8un31ucid4  [Lucid  4.0.1]  [Safe  Compiling  Switches] 

- Source - — - - - 

(in-package  "E-L") 

[Utilities  for  bootstrapping  foreign  (C-coded)  object  files] 
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(let  ((♦load-if-source-newer*  : compile) 

(*load-il-8ource-only*  : compile)) 

[Kernel  for  any  Common  LiSP  worker]  <8pec> 

[Debugging  support  for  Common  LiSP  derivers]<8pec> 

[The  clisp-module/pf  a8l  cleriver]<8pec> 

(Other  deriver  modules  could  be  referenced  here.) 

[Rename  worker- kernel  and  deriver  packages]) 

[Create  a  worker  executable  when  bootstrapping] 

It’s  probably  a  good  idea  to  give  each  Common  Lisp  deriver  its  own  package.  Certainly  the 
deriver  should  not  be  in  package  USER.  Your  package  name  should  begin  with  E-L,  since 
such  package  names  are  reserved  to  the  worker  kernel  and  derivers.  Packages  E-L,  E-L/, 
E-L-CLISP,  and  E-L-CLISP/  are  already  used:  E-L  is  the  package  for  the  code  of  the  Lisp 
worker  kernel,  and  E-L-CLISP  is  that  of  the  clisp-module/pf  asl  deriver.  In  each  case,  the 
source  package  is  renamed  by  appending  a  /  when  the  worker  is  built,  cis  follows: 

Rename  worker-kernel  and  deriver  packages 
(when  (find-package  "E-L/")  (delete-package  "E-L/")) 

(rename-package  "E-L"  "E-L/") 

(when  (find-package  "E-L-CLISP/")  (delete-package  "E-L-CLISP/")) 

(rename-package  "E-L-CLISP"  "E-L-CLISP/") 

This  package  renaming  is  important  to  avoid  confusion  when  deriving  from  the  artifacts 
that  are  the  source  for  the  worker  kernel  and  the  deriver  itself.  If  your  deriver  operates  on 
Common  Lisp  code,  and  might  therefore  operate  on  its  own  source,  you  should  probably 
renzune  its  package  in  a  similar  way. 

The  function  that  registers  the  deriver  in  the  worker  can  occur  in  the  module  that  contains 
it;  for  example,  [The  clisp-module/pf  asl  deriver]  referenced  in  the  root  module  above 
contains  the  following  definition: 
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Register  deriver  functions  for  the  pf  asl  deriver. 
(•st«r-d«river 

(format  nil  "clisp-module/pfasl-'A'A" 
(get-machine) 

(string-dovncase  (lisp-architecture) ) ) 
*pfasl-fl  computable-kinds  function 
*pfa8l-f2  requisite-kinds  function 
’pfasl-f3  reference-handle-kinds  function 
’pfasl-f4  compare-relative  function 
’pfasl-fS  derive  function 
nil  abort-job-hook 

t)  job-server-incremental  flag 


Concoct  the  deriver  name 


Observe  that  the  deriver  functions  are  supplied  as  symbols.  When  the  deriver  is  executed,  the 
function  values  of  these  symbols  are  used.  This  is  much  better  for  debugging  than  entering 
actu2d  function  values,  because  tracing  requests  and  or  redefinitions  act  on  symbols,  not  on 
function  values. 


4.8.2  Types  and  Calling  Conventions 

This  section  describes  the  correspondence  between  Common  Lisp  types  and  those  used  in 
the  language-independent  description  of  derivers  in  section  4.5.  Recall  the  Common  Lisp  has 
typed  objects,  not  typed  variables. Thus  the  specifications  below  mean  that  the  objects 
passed  to  and  returned  from  the  relevant  functions  should  have  the  indicated  types. 

The  calling  conventions  for  the  Lisp  functions  are  exactly  analogous  to  those  in  the 
language-independent  specification  found  in  section  4.5,  by  way  of  the  type  correspondence 
below. 

Recall  that  the  Lisp  type  encompassing  all  objects  is  called  t.  In  the  following  list  we 
use  standard  Common  Lisp  type  specifiers**  whenever  possible. 

•  body  —  Lisp  type  string. 

•  boolean  (tl  hint  —  We  here  define  a  boolean  to  be  one  of  the  values  nil  or  t.  A  hint 
can  be  anything  the  deriver  writer  likes  except  one  of  these  values.  Thus  the  union 
type  is,  technically,  t. 

•  dossiei - A  dossier  is  a  list  with  the  following  accessors,  each  of  which  takes  a  single 

dossier  argument  and  returns  the  type  indicated: 

—  dossier-body  —  Lisp  type  string. 

—  dossier-contents  —  Lisp  type  t. 

—  dossier-requisite-k-d  —  a  kind-derivative-sequence  (see  below). 

—  dossier-reference-h-k-d —  a  handle-kind-derivative-sequence  (see  below). 


*°See  Chapter  2  of  Common  Lisp  The  Language,  2nd  edition,  for  details. 
*Mbid.,  Chapter  4. 
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•  filename  —  Lisp  type  string. 

•  kind  —  Lisp  type  string. 

•  location  —  a  list  with  a  single  element,  that  element  being  the  contents  of  the  artifact 
under  derivation.  Unless  changed  by  the  deriver,  the  element  is  nil.  Given  a  location 
loc,  the  deriver  writer  should  access  its  contents  using  (car  loc)  and  set  the  contents 
using  (setf  (car  loc)  . . . ) . 

•  handle-kind-derivative-sequence  —  Lisp  type  (list  h-k-d).  Each  h-k-d  is  a  list  of 
three  elements 

(.handle  kind  derivative) 

where  handle  has  type  f  ixnum  and  both  kind  and  derivative  have  type  string. 

•  handle-kind-sequence  —  Lisp  type  (list  h-k).  Each  h-k  is  a  list  of  two  elements 

(handle  kind) 

where  handle  is  a  f  ixnum  and  kind  is  a  string. 

•  hint-set  —  list.  This  requirement  that  a  set  be  implemented  as  a  list  may  be  odious 
to  some,  and  is  subject  to  revision. 

•  integei - f  ixnum 

•  kind-derivative-sequence  —  Lisp  type  (list  k-d),  where  each  k-d-set  is  list  of  two 
strings 

(kind  derivative) 

•  kind-derivative-set-sequence  —  Lisp  type  (list  k-d-set).  A  k-d-set  is  a  list  of  two 
elements 

(kind  derivative-set) 

where  kind  is  a  string  and  derivative-set  has  type  (list  string).  Thus  to  obtain 
the  derivative-set  of  a  k-d-set,  use  cdr. 

•  kind-sequence  —  (list  string). 

•  kind/uid-request-sequence  —  (list  k-ur)  A  k-ur  is  a  list  of  two  elements 

(kind  count) 

where  kind  is  a  string  and  count  is  a  f  ixnum. 

•  path  —  Lisp  type  string. 

•  uids — Lisp  type  (list  (list  string)) 
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4.8.3  Relevant  Utilities 

Section  4.6  describes  facilities  for  use  by  the  deriver  writer  and  provided  by  all  workers.  This 
section  describes  additional  facilities,  or  clarifications  to,  or  deviations  from  the  specification 
in  section  4.6.  In  general,  if  section  4.6  describes  a  function  as  returning  more  than  one 
value,  then  the  Lisp  version  returns  multiple  values  whenever  the  Lisp  dialect  provides  for 
them;  otherwise  the  function  returns  a  list  of  values.** 

I  enter-deriver  I  [function] 

Args:  tool  computable-kinds  requisite-kinds  reference-handle-kinds  compare-relative  derive 
abort-job  &optional  (job-server-incremental  t) 

Result  type:  no  result 

•  Register  for  tool  the  six  functions  defining  a  deriver  in  the  current  Lisp  worker. 

•  tool  should  be  a  string.  The  remaining  arguments  should  be  symbols,  nil  if  the 
default  functions  are  to  be  used. 

•  Obviously,  derive  cannot  be  nil. 


[  streajn-and-uid 

Args;  flag  area  &optional  (extension  nil) 
Result  type:  stream  string 

•  See  section  4.6  for  the  semantics. 


[function] 


•  flag  is  one  of  the  keywords  :derivative  or  : artifact,  rather  than  an  integer. 

•  The  result  is  a  Lisp  output  stream  and  a  string  giving  the  uid. 


I  f ilename-and-uid 

Args:  flag  area  &optional  (extension  nil) 
Result  type;  streaun  string 

•  See  section  4.6  for  the  semantics. 


[function] 


•  flag  is  one  of  the  keywords  :derivative  or  : artifact,  rather  than  an  integer. 

•  The  result  is  two  Lisp  strings  giving  the  file  name  and  the  associated  uid. 


I  bad-der i vat i ve 

Args:  message 
Result  type:  string 

*^Common  Lisp  implementations  return  multiple  values. 


[function] 
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•  Return  a  properly  constructed  bad  derivative  encoded  with  the  given  string  message. 


I  error-derivati^ 

Args:  summary  type  details  uid-list 
Result  type:  string 


[function] 


•  Return  an  error  derivative  using  the  strings  summary,  type,  details  and  the  list  of 
uids  in  uid-list. 


•  See  section  3.5  for  a  description  of  the  syntax  and  treatment  of  error  derivatives. 


[function] 

Args;  type  &optional  filename 
Result  type;  string 

•  See  the  corresponding  definition  in  Artifacts  System  C  Utilities  for  the  semantics. 

•  type  and  filename  are  Lisp  strings. 


artif act-contents-path 


I  with-stream-and-uid 

Args:  (stream  uid  flag  area  &optional  (extension  nil)  (abort  nil))  &body  body 
Result  type:  None 


[macro] 


•  Bind  stream  and  uid  as  the  results  of  a  call  to  stream-amd-uid,  and  execute  body  in  a 
manner  analogous  to  the  Common  Lisp  macro  with-open-f  ile. 

•  If  the  boolean  abort  is  not  null,  then  delete  the  file  bound  to  stream  upon  abnormal 
exit.  Otherwise  just  close  it. 

•  The  arguments  flag,  area  and  extension  have  the  same  meaning  that  they  do  in  stream- 
and-uid. 

•  Because  of  the  unwind-protection  one  should  prefer  this  macro  over  direct  uses  of 

stream-and-uid. 


4.8.4  Debugging  A  Deriver 

Debugging  a  Common  Lisp  worker  turns  out  to  be  rather  different  from  debugging  the  other 
workers.  The  main  issue  is  that  the  Common  Lisp  worker  is  multi-threaded. 

Initially  you  will  be  debugging  a  Common  Lisp  worker  assembled  in  an  interactive  Lisp 
Session.  Suppose  the  worker  name,  as  registered  in  the  workers  plex,  is  sun41ucid4,  and 
that  that  host  you’re  running  it  on  is  myhost.  You  can  start  an  inferior  job  server  under 
your  interactive  worker  by  calling  usual-start-server  with  the  strings  "sun41ucid4"  and 
"myhost"  as  arguments.  To  interrupt  the  worker’s  main  event  loop,  use  a  normal  ‘C  kej'board 
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interrupt.  At  that  point  you  can  stop  or  kill  the  inferior  server  by  invoking  stop-server  or 
kill-server,  respectively. 

However,  any  time  the  worker  enters  a  break,  you  must  know  which  task  is  currently 
executing.  If  the  worker  is  at  its  initial  task,  it  is  safe  to  choose  the  option  of  returning  to 
top  level.  However,  if  a  task  associated  with  a  derivation  is  currently  executing  and  you  don’t 
want  to  continue,  you  should  usually  select  the  kill  current  process  option.  (By  ‘‘process”, 
Lucid  means  a  Lucid  task.)  Then  you  will  end  up  with  the  dispatching  process  still  running. 

You  can  debug  a  standalone  worker  interactively.  To  do  so,  start  the  executable  with  a 
single  Unix  command-line  argument:  -interactive.  When  the  worker  starts  up  you  will 
get  an  interactive  Lisp  prompt,  and  can  use  usual-start-server,  and  so  on. 

4.9  How  to  Write  a  Deriver  in  Emacs  Lisp 

To  add  a  deriver  to  a  new  or  existing  EmaCS  worker,  perform  the  following  steps: 

1.  Write  the  deriver  functions  described  in  section  4.5. 

e  Section  4.9.2  describes  the  EmaCS  Lisp  versions  of  data  types  mentioned  in  4.5 
and  calling  conventions  for  the  deriver  functions. 

•  Section  4.9.3  and  the  document  Emacs  Lisp  Utilities  describe  various  utilities 
useful  for  writing  derivers. 

2.  Extend  an  existing  EmaCS  worker  with  your  deriver,  or  create  a  new  worker.  See 
section  4,9.1  for  details. 

3.  Debug  your  deriver  using  methods  described  in  section  4.9.4. 

4.9.1  How  to  Create  or  Extend  an  Emacs  Worker  Executable 
To  create  a  new  EmacS  worker  it  is  necessary  to  provide  an  Emacs  Lisp  file  which 

•  includes  the  top-level  form  (require-emacs-lisp  "elw"), 

•  defines  the  nullary  function  extenders-init, 

•  and  in  this  function  calls  enter-deriver  with  the  names  of  the  functions  that  consti¬ 
tute  your  deriver. 

Section  4.9.1  presents  a  simple  example. 

To  extend  an  existing  worker,  it  is  only  necessary  to  augment  the  extenders-init  func¬ 
tion  for  that  worker  with  a  call  to  enter-deriver  to  make  your  new  deriver  known  to  the 
worker. 
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•  «nter-deriver  deriver-name  f\  /j  /a  /j  /g  job-server-incremental  —* 

-  Register  the  deriver  named  deriver-name.  See  page  4-8  for  details. 

—  At  present  /g  must  be  nil. 

It  is  also  necessary  to  add  a  new  target  entry  to  the  following  Unix-program  artifact 

Emacs  Lisp  Worker  Monitor 

• -  Targets - - - — — _ _ 

8un4-dbg>nanuscript  [cc  (universal  artifact)]  -g  -Bstatic 
C  [cc  (universal  artifact)]  -g  -DLOAD.FILE^NAME="MS-derivers" 

Bips-dbg-nanuscript  [cc  (universal  artifact)]  -g 
C  [cc  (universal  artifact)]  -g  -DLOAD.FILE^NAME="MS-derivers" 

- Modules /Instructions - - - - 

[elw-parent] 

This  program  starts  the  EmaCS  process  by  loading  basic  and  a  single  file  provided  by 
you,  and  causes  Emacs  to  execute  the  function  elwrmain.  In  addition,  Emacs  Lisp  Worker 
Monitor  manages  the  standard  output  and  error  of  the  Emacs  process,  causing  RPC’s  to 
be  sent  out  on  standard  output,  causing  other  messages  to  be  sent  out  on  standard  error 
(whence  they  end  up  in  the  job  log).  If  all  the  functions  for  your  derivers  are  loaded  by 
loading  the  file  example .  el c,  then  you  would  add  targets  (one  for  each  architecture)  to 
Emacs  Lisp  Worker  Monitor,  each  of  which  compiles  it  with  the  switch 

-DLOAD.FILE_NAME= " example " 

The  result  would  be: 

Emacs  Lisp  Worker  Monitor  (with  new,  example  deriver  added) 

-  Targets - — - - - 

sun4-<ibg-manuscript  [cc  (universal  artifact)]  -g  -Bstatic 
C  [cc  (universal  artifact)]  -g  -DLOAD.FILE_NAHE=''MS-derivers" 
sun4-dbg-example  [cc  (universal  artifact)]  -g  -Bstatic 
C  [cc  (universal  artifact)]  -g  -DLOAD_FILE_NAME="example" 
mips-dbg-manuscript  [cc  (universal  artifact)]  -g 
C  [cc  (universal  artifact)]  -g  -DLOAD.FILE_KAME="MS-derivers" 
mips-dbg-example  [cc  (universal  artifact)]  -g 
C  [cc  (universal  artifact)]  -g  -DLOAD.FILE_NAME="example" 

- Modules /Instructions - — _ _ _ 

[elw-parent] 


Example  Emacs  Deriver  The  Emacs  code  for  the  example  deriver  could  be  the  follow¬ 
ing: 


(require-emacs-lisp  "elw") 


emacs-example 
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(defun  extenders- init  ()  emacs- example 

(enter-deriver  "exaople-k"  nil  nil  nil  nil  ’example-derive  nil  t)) 

(defun  example-derive  (dossier  relative-k-D-pairs  hint-list)  emacs-example 

(vector  "dome  derivative")) 

4.9.2  Types 

We  enumerate  the  Emacs  Lisp  representations  of  the  language-independent  types  used 
above. 

e  body — string 

a  boolean  1+1  hint — Arbitrary  lisp  value;  the  symbol  t  is  boolean  true,  the  symbol  nil  is 
boolean  false,  and  any  other  value  is  a  hint. 

e  cell — list,  with  slot  access  functions.  The  functions  cell-slot  take  a  cell  as  argument 
and  return  the  value  of  slot.  Valid  values  for  slot  are 

—  pointer 

—  body 

—  references 

—  timestamp 

—  type 

—  name 

—  creator 

e  derivative-string — string 

e  dossier — list,  with  slot  access  functions.  The  functions  dossier-slot  take  a  dossier  as 
argument  and  return  the  value  of  slot.  Valid  values  for  slot  are 

—  body 

—  contents 

—  requisite-k-d 

—  reference-h-k-d 

—  cell 

The  cell  used  to  be  private,  but  too  many  derivers  required  access  to  it. 

•  handle-kind-derivative-sequence — a  vector  of  lists,  each  of  the  form 

(.handle  kind  derivative-string). 
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•  kandle-kind-sequence — a  vector  of  lists,  each  of  the  form 

(handle  kind). 

•  hint-set — a  list  of  arbitrary  values,  excluding  the  symbols  t  and  nil. 

•  integer — integer. 

•  handle — integer. 

•  kind — string. 

•  kind- derivative-set- sequence — an  a-list  from  kind  (string)  to  a  list  of  derivative  strings. 

•  kind/uid-request-sequence — a  vector  of  lists,  each  of  the  form 

(kind  integer). 

•  location — a  cons  cell;  the  deriver  must  modify  only  the  car;  never  the  cdr. 

•  uids — a  vector  of  vectors  of  strings. 

4.9.3  Relevant  Utilities 

See  the  Emacs  Lisp  Utilities  document  for  general-purpose  utilities.  Otherwise  there  is  only 
a  single  deriver-specific  utility: 

•  filename-and-uid  flag  area  &optional  (extn  nil)  — »  (filename  .  uid) 

—  See  section  4.6. 

4.9.4  Debugging  A  Deriver 

An  Emacs  worker  is  debugged  in  an  interactive  EmacS  session,  from  which  an  inferior  job 
server  can  be  started. 

The  connection  to  the  job  server  (while  debugging)  is  fairly  fragile.  The  most  difficult 
problem  is  that  Emacs  Lisp  can’t  tell  the  difference  between  the  standard  error  and  the 
standard  output  of  the  job  server.  Therefore  when  the  job  server  writes  log  messages,  they 
go  to  the  worker’s  RPC  input.  This  is  not  so  bad,  because  the  worker  understands  these  log 
messages  as  RPCs,  and  acts  appropriately.  The  problem  arises  when  the  job  server  writes  a 
log  message  while  processing  an  RPC  that  has  a  reply.  In  this  case  the  worker  interprets  the 
log  message  as  an  (illegal)  reply  to  the  RPC.  There  seems  to  be  no  way  around  this  without 
augmenting  the  standard  RPC  mechanism. 

Messages  from  both  the  worker  and  the  job  server  appear  in  the  buff'er  named  *elw-log*. 
To  debug  an  EmacS  worker  we  recommend  the  following  procedure: 

1.  Make  sure  that  the  derivers  and  workers  plexes  are  set  up  correctly. 
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2.  Using  e-1  -do,  start  an  Emacs  (not  Epoch)  session. 

3.  Load  the  Emacs  Lisp  file  basic  (with  M-x  load-file  $E_L_BIN/emacs-lisp/basic), 
and  then  load  the  Emacs  Lisp  file  that  defines  the  derivers  (which  must  require- 
emacs-lisp  elw). 

4.  Issue  the  command  M-x  init  (which  sets  debug-on-error  to  t). 

5.  Issue  the  command  M-x  start-server  to  start  an  inferior  job  server.  When  prompted, 
enter  the  name  of  the  worker  being  debugged  (from  the  workers  plex),  and  a  name 
for  the  current  host  that  is  associated  with  the  desired  architecture  in  the  hosts  and 
workers  plexes. 

This  also  starts  the  worker’s  main  event  loop;  you  must  use  C-g  to  quit  this  loop. 

6.  The  command  M-x  stop-server  stops  an  inferior  job  server  that  has  no  jobs. 

7.  The  command  M-x  cleainup-servers  kills  (with  SIGINT)  any  job  servers  that  the 
worker  still  knows  about. 

8.  There  is  also  a  M-x  kill-server  command,  which  will  kill  the  most  recently  started 
job  server  (if  it  still  exists). 
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A  Thumbnail  Sketches  of  Umpteen  Derivers 

The  sections  below  have  titles  of  the  form  t,  k  possibly  followed  by  a  qualifier.  Such  a  section 
describes  the  functional  arguments  given  to  the  universaljderiver  to  compute  from  an 
artifact  of  type  t  a  derivative  of  kind  k.  The  artifact  under  derivation  will  be  denoted  p,  and 
in  compare-relative^  the  relative  is  p'. 

Each  section  below  gives  a  brief  sketch  of  a  deriver,  including,  when  A:  is  a  single  derivative 
class,  the  representation  of  the  derivative  class  (as  seen  in  the  derivative-classes  plex), 
^uld  in  all  cases  the  kind-re  and  kind-re-replacement  (as  seen  in  the  derivers  plex)  and  a 
description  of  the  functional  arguments  to  the  universal-deriver.  These  summaries  are 
not  meant  to  stand  alone,  but  to  facilitate  comparisons  of  different  derivers,  and  to  suggest 
strategies  for  a  deriver  you  might  want  to  write. 

A.l  t,  caption  for  Many  Types  t 

See  the  document  ^TgX  Artifacts  for  a  more  complete  discussion, 
e  representation:  [“\0\1\12]* 

•  kind-re,  kind-re-replacement:  caption 

•  computable-kinds:  null,  because  caption  is  the  only  kind. 

•  requisite-kinds:  null,  no  derivatives  necessary  from  p. 

•  reference-handle-kinds:  null,  because  the  caption  does  not  depend  upon  references. 

•  compare-relative:  null,  because  with  a  null  set  of  handle/kind  pairs,  this  would  be 
called  only  when  the  bodies  are  different,  in  which  case,  nothing  is  known  about  the 
caption. 

•  derive:  read  the  body,  and  for  most  types,  use  the  first  line,  perhaps  stripping  off  some 
comment  characters;  ignore  derivatives  of  predecessors,  because  there  are  no  uid’s  to 
be  re-used. 

A. 2  t,  manuscript  for  Many  Types  t 

This  section  applies  to  artifacts  whose  contents  are  set  in  \tt  font,  except  for  its  captions. 

•  representation:  See  section  A. 3  for  manuscript  representation.  In  the  cases  considered 
here,  the  derivatives  will  have  the  representation  L /,  where  /.  is  a  file  in  the  meuiuscript 
directory  containing  the  I^TgX  source. 

•  kind-re,  kind-re-replacement:  manuscript 

•  computable-kinds,  requisite-kinds:  null 

•  reference-handle-kinds:  uniform-handle-kinds(0, n  —  1, caption). 
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•  compare-relative:  null,  because  all  of  the  body  and  all  of  the  references’  derivatives 
affect  the  manuscript  derivatives. 

•  derive:  obtain  a  new  uid  /,  open  a  stream  on  it,  and: 

—  Output  some  initialization  to  the  stream  that  will  establish  a  scope  in  which  page 
breaics  will  not  occur. 

—  Output  \tt  and  a  newline  to  the  stream. 

—  Output  the  body  to  the  stream,  quoting  all  special  I^TgX  characters. 

—  When  encountering  a  reference,  output  its  caption,  surrounded  by  braces  and  a 
change  of  the  font  to  \nn. 

—  Output  the  close  of  scope  to  the  stream,  close  the  file. 

—  Let  the  kind-derivative-set-sequence  parameter  be  {(manuscript, L/')},  and  let 
f"  be  the  result  of  compare-files(/, /').  The  result  is  {{Lf")}. 

A.3  latex-piece,  manuscript 

See  the  document  MgX  Artifacts  for  a  more  complete  discussion. 

•  representation:  This  is  wildly  complicated,  to  provide  for  efficient  incremental  deriva¬ 
tion  in  highly  tree-structured  documents.  It  is  included  only  for  completeness,  and  to 
show  how  complicated  derivatives  can  be: 


’Lf . . 

I’Pf..’  [“\0\1\12]* 

I  ’Bb{latex-piece} . . 

I  ’N(: Inode: s. .< 
bflatex-piece}.  . 

( ’u 

(’C[“\0\l\12]*’\l 
I  ’Lf . . 

1 ’Bb{latex-piece} . . 
I ’P[“\0\l\12]*’\l 
I ’N=lnode=) 

)+>) 

(’/:pnode:s< 

(CO-93  + 

(’:f.. 

I ’ /*pnode*) 

)+<»  » 

I) 


A  I^TjX  source  file  (ending  with  dot) 
A  PostScript  file  and  a  macro  string 
A  latex-piece  body. 

A  file  having  . . . 
a  latex-piece  body,  and 
a  list  of  space  followed  by 
a  caption,  or 
a  ^TgX  source  file,  or 
a  latex-piece  body,  or 
a  macro  string, 
recursively,  a  node  file. 

End  list  in  the  file. 

Optionally,  another  file  with 
a  part  index 
a  PostScript  file,  or 
a  PostScript  node. 

End  space- separated  list,  end  file. 

The  no  file  option. 


•  kind-re,  kind-re-replacement:  meinuscript 
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•  computable-kinds,  requisite-kinds:  null 

•  reference-handle-kinds:  The  kinds  here  aire  not  quite  uniform — for  each  handle,  the 
kind  is  manuscript,  unless  the  artifact  specifies  <caption>,  in  which  case  it  is,  surprise, 
caption. 

There  are  useful  hints  for  this  deriver:  a  uid  of  an  Inode  (the  I^T£X  part)  or  the  uid  of  a 
pnode  (the  PostScript  part).  (If  a  single  predecessor  has  both,  its  derivative  is  good  for  p. ) 

•  compare-relative: 

—  Let  the  kind-derivative-sequence  argument  be  {(manuscript,  m')}.  If  m'  is  not  of 
the  form  NnVr',  yield  false  (for  rectsons  too  subtle  to  go  into  here). 

—  If  p'  and  p  have  the  same  bodies  and  if  the  “I^^TgX  part”  of  the  derivatives  of 
references  of  p  and  the  contents  of  n'  are  sufficiently  similar,  yield  n'  as  the  hint. 

—  If  “PostScript  part”  of  the  derivatives  of  references  of  p  and  the  contents  of  r'  are 
sufficiently  similar,  yield  r'  as  the  hint. 

—  Otherwise,  yield  false. 

•  derive: 

-  If  the  result  can  be  constructed  by  using  pieces  of  hints,  that  is  the  result. 

—  Let  b  be  the  body  of  p. 

—  Up  has  no  references,  the  result  is  (Bb). 

—  If  there  is  an  n  among  the  hints,  it  will  be  the  n  part  of  the  result.  Otherwise, 
get  a  new  uid  n  (in  manuscript)  for  an  Inode.  The  first  item  in  its  contents  will 
be  6.  Construct  the  rest  of  the  contents  of  n  from  the  derivatives  of  references  of 
p,  and  while  doing  so,  keep  track  of  any  PostScript  that  is  encountered. 

—  If  no  PostScript  parts  were  encountered,  the  result  is  (Nn). 

—  Otherwise,  if  there  is  an  r  among  the  hints,  it  will  be  the  r  of  the  result.  If  there 
is  no  such  r,  get  a  new  uid  r  (in  mamuscript)  for  a  pnode,  and  fill  in  its  contents 
using  the  PostScript  encountered  in  the  previous  step.  The  result  is  (Nn/r). 


A.4  latex-root,  latex-directory 

•  representation:  Not  quite  as  bad  as  manuscript; 


d'uC0-9]+ 

('u 

rpcontents: 

(([0-9]+ 

('  :f{meuiuscript} . . 

I Vs{m2muscript><=pcontents=>) 
)+<'  >) 

I) 


A  directory  and  a  magic  integer. 
Beginning  of  possible  list. 

Contents  of  a  pnode. 
a  handle  index 
a  PostScript  file,  or 
a  PostScript  node. 

End  space- separated  list. 

End  of  possible  list  and  null  option. 
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•  kind-re,  kind-re-replacement:  latex-directory 

e  computable-kinds,  requisite-kinds:  null 

e  reference-handle-kinds:  for  most  handles,  the  manuscript  is  needed;  exceptions  are 
the  X  derivative  for  references  marked  with  <x>,  for  x  =  bibliography  and  caption. 

A  hint  for  this  derivation  consists  of  the  uid  of  an  existing  derivative,  a  flag  saying  whether 
the  bibliography  information  is  valid  for  p,  map  from  new  part  indices  to  old  part  indices, 
a  vector  of  flags  saying  whether  the  new  part  is  “^TgX-equivalent”  to  the  old,  and  a  flag 
saying  whether  everything  except  the  PostScript  information  is  the  same. 

•  compare-relative: 

—  Look  at  the  bodies  of  p'  and  p,  and  the  derivatives  of  bibliographic  references  to 
compute  the  bibliography  flag. 

—  To  obtain  the  map  from  the  new  to  old  parts,  determine  historical  correspondences 
between  the  references  of  p'  and  those  of  p,  where  in  each  Ccise  the  references  are 
those  after  the  Nbeginfdocument}.  (The  magic  integer  in  the  derivative  makes 
locating  these  references  easy.) 

-  The  vector  of  flags  for  I^^T£X-equi valence  is  computed  by  looking  at  the  body  of 
p  and  the  derivatives  of  references  occurring  after  Xbeginfdocument}. 

-  The  all-ok-except-maybe-PostScript  flag  is  true  if  and  only  if  the  map  from  new 
to  old  parts  is  the  identity  vector,  the  magic  number  for  p'  matches  what  would  be 
the  magic  number  for  p,  the  bibliography  flag  is  true,  and  the  lATgX-equivalence 
vector  is  all  true. 

•  derive: 

—  If  there  is  a  hint  in  which  all  that  has  changed  is  the  PostScript,  it  is  not  necessary 
to  set  up  a  new  directory.  The  new  derivative  string  can  use  the  old  uid  for  the 
directory,  and  re-use  any  of  the  PostScript  data,  if  possible. 

—  Otherwise,  set  up  a  new  directory  dir. 

—  If  there  is  no  hint,  initialize  dir  and  run  I^TgX  in  it. 

—  If  there  is  more  than  one  hint  (oh  lucky  day),  the  best  one  is  that  in  which  the 
derivatives  of  the  header  references  are  identical.  Other  than  this,  a  single  hint  is 
picked  arbitrarily. 

—  Initialize  dir  to  just  the  differences  indicated  by  the  hint,  and  run  l^TgX  only 
where  the  hint  indicates  a  necessity. 

There  is  substantial  omission  of  detail  here.  I^TgX  is  probably  the  most  difficult  tool  to 
integrate. 
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A. 5  shellscript,  k  for  k  ^  manuscript 

The  manuscript  derivative  of  a  shellscript  artifact  is  not  obtained  by  running  it,  but  ^ls 
described  in  section  A. 2.  For  all  other  derivatives; 

•  representation:  A  shellscript  artifact  can  produce  any  kind  of  derivative  whose 
representation  is  f,  f<er<cnsion>,  or  d.  It  also  always  produces  standard-output  and 
standzurd-error  derivatives,  whose  representations  are  each  f . 

e  kind-re,  kind-re-replacement:  \"manuscript 

e  computable-kinds: 

—  Read  the  contents  of  the  artifact 

—  Yield  {st2Uidard-output,  standard-error}  U  {k  |  0{fc}  is  in  the  contents) 
e  requisite-kinds:  null 

e  reference-handle-kinds:  {{h,k)  \  h{k]  is  in  the  contents} 
e  derive: 

—  Replace  each  hik}  in  the  contents  with  the  corresponding  derivative,  using  the 
extension  from  the  representation,  if  there  is  one. 

-  Replace  each  C{fc}  in  the  contents  with  string  obtained  by  substituing  a  new  uid 
for  the  \f  or  \d  in  the  representation  field  of  the  derivative  class  of  k. 

—  Run  the  resulting  shell-script  directing  standard  output  and  standard  error  to 
new  uids  for  standard- output  and  stemdard-error. 

—  Let  {{derive-km',Dm)}m  denote  the  value  of  the  kind-derivative-set-sequence  ar¬ 
gument  and  let  {{kj,dj)}j  denote  the  kind-derivative-sequence.  If  any  kj  is  not 
among  the  derive-km,  delete  the  file  dj.  Otherwise,  yield  compare-f iles(dj.  Dm) 
as  the  result  corresponding  to  derive-km  ■ 
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